Polyketides and Their Synthesis

Technical Field 

The present invention relates to processes and 
materials (including enzyme systems, nucleic acids, 
vectors and cultures) which can be used to influence the 
selection of acylthioester units for the synthesis of 
polyketides, and to the resulting polyketides, which may 
be novel. It is particularly concerned with macrolides, 
polyethers or polyenes and their preparation making use 
of recombinant synthesis. 

In preferred types of embodiment, polyketide 
biosynthetic genes or portions of them, which may be
derived from different polyketide biosynthetic gene 
clusters, are manipulated to allow the production of 
specific polyketides, such as 12-, 14- and 16-membered 
macrolides, of predicted structure. The invention is 
particularly concerned with the modification of an Acyl 
CoA:ACP transferase (AT) function, generally by modifying 
genetic material encoding it, in order to prepare
polyketides with a predetermined ketide unit, e.g. 
incorporating (a) a malonate extender unit; or (b) a 
methylmalonate extender unit; or (c) an ethylmalonate 
extender unit; or (d) a further type of extender unit; or 
(e) an acetate and/or malonate starter unit; or (f) a
propionate and/or methylmalonate starter unit; or (g) a
butyrate and/or ethylmalonate starter unit; or (h) a
further type of starter unit. Of course, the invention can
be used to influence more than one ketide unit of a
polyketide. The method enables one to minimise
alteration to the protein structure of the polyketide
synthase.

Polyketides are a large and structurally diverse 
class of natural products that includes many compounds 
possessing antibiotic or other pharmacological
properties, such as erythromycin, tetracyclines,
rapamycin, avermectin, monensin, epothilone and FK506.
In particular, polyketides are abundantly produced by
Streptomyces and related actinomycete bacteria. They are
synthesised by the repeated stepwise condensation of
acylthioesters in a manner analogous to that of fatty
acid biosynthesis. The structural diversity found among
natural polyketides arises in part from the selection of
(usually) acetate (malonyl-CoA) or propionate
(methylmalonyl-CoA) as "starter" or "extender" units
(although one of a variety of other types of unit may
occasionally be selected), as well as from the differing
degree of processing of the β-keto group formed after
each condensation. Examples of processing steps include
reduction to β-hydroxyacyl-, reduction followed by
dehydration to 2-enoyl-, and complete reduction to the
saturated acylthioester. The stereochemical outcome of
these processing steps is also specified for each cycle
of chain extension. Methylation at the α-carbon or
β-hydroxy is also sometimes observed.

The biosynthesis of polyketides is performed by a 
group of chain-forming enzymes known as polyketide
synthases. Two broad classes of polyketide synthase
(PKS) have been described in actinomycetes. One class,
named Type I PKSs, represented by the PKSs for the
macrolides erythromycin, oleandomycin, avermectin, and
rapamycin and by the PKS for the polyether monensin,
consists of a different set or "module" of enzymes for
each cycle of polyketide chain extension. For an example
see Figure 1 (Cortes, J. et al. Nature (1990) 348:176-178;
Donadio, S. et al. Science (1991) 252:675-679;
Swan, D.G. et al. Mol. Gen. Genet. (1994) 242:358-362;
MacNeil, D.J. et al. Gene (1992) 115:119-125;
Schwecke, T. et al. Proc. Natl. Acad. Sci. USA (1995)
92:7839-7843; also Patent application WO 98/01546). The
genes encoding numerous Type I PKSs have been sequenced
and these sequences disclosed in publicly available DNA
and protein sequence databases including Genbank,
Swissprot and EMBL. For example, the sequences are
available for the PKSs governing the synthesis of
erythromycin (Cortes, J. et al. Nature (1990)
348:176-178, accession number X62569; Donadio, S. et al.
Science (1991) 252:675-679, accession number M63677);
rapamycin (Schwecke, T. et al. Proc. Natl. Acad. Sci.
(1995) 92:7839-7843; accession number X86780);
rifamycin (August, P. et al. Chem. Biol. (1998)
5:69-79; accession number AF040570) and tylosin (Eli
Lilly, accession number U78289), among many others.

The term "polyketide synthase" (PKS) as used herein 
refers to a complex of enzyme activities responsible for
the biosynthesis of polyketides. These enzyme activities
include β-ketoacyl ACP synthase (KS), acyltransferase
(AT), acyl carrier protein (ACP), β-ketoreductase (KR),
dehydratase (DH), enoyl reductase (ER) and thioesterase
(TE) but are not limited to these activities. Each of
these activities lies on a separate protein or
polypeptide fragment responsible for this activity. Such
a fragment is termed a "domain". The terms "motif" or
"signature sequence" used herein refer to a small stretch
of amino acids (usually fewer than 10 amino acids) within
a domain responsible (at least in part) for one aspect of
the catalytic function, for example, choice of substrate.
The term "extension module" as used herein refers to the
set of contiguous domains, from a β-ketoacyl-ACP synthase
("KS") domain to the next acyl carrier protein ("ACP")
domain, which accomplishes one cycle of polyketide chain
extension; this may or may not include domains
responsible for the reductive processing of the
polyketide chain. The term "loading module" is used to
refer to any group of contiguous domains that
accomplishes the loading of the starter unit onto the PKS
and thus renders it available to the KS domain of a
specific extension module.
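
The definition of an "extension module" given above (the contiguous
domains from a KS domain to the next ACP domain) can be illustrated
with a short Python sketch. This is only an illustration of the
definition, not part of the disclosed method: the function name
split_extension_modules and the example domain order are hypothetical,
and domains preceding the first KS are simply treated as belonging to
a loading module and skipped.

```python
from typing import List


def split_extension_modules(domains: List[str]) -> List[List[str]]:
    """Group a linear list of domain labels into extension modules.

    Each module runs from a "KS" domain to the next "ACP" domain,
    inclusive. Domains that precede the first KS (for example a
    loading module) are not part of any extension module.
    """
    modules: List[List[str]] = []
    current: List[str] = []
    in_module = False
    for label in domains:
        if label == "KS":
            # A KS domain opens a new extension module.
            current = ["KS"]
            in_module = True
        elif in_module:
            current.append(label)
            if label == "ACP":
                # The next ACP after the KS closes the module.
                modules.append(current)
                in_module = False
    return modules


# Hypothetical domain order, for illustration only.
example = ["AT", "ACP", "KS", "AT", "KR", "ACP", "KS", "AT", "ACP"]
modules = split_extension_modules(example)
```

In this example the leading AT and ACP are left out as a loading
module, and the two extension modules differ only in whether a
reductive (KR) domain is present.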

Background Art 

Several approaches to altering the nature of the 
polyketide product of a PKS by genetic engineering have 
been proposed: see particularly WO 93/13663 and WO
98/01571. The length of polyketide formed has been
altered, in the case of erythromycin biosynthesis, by
specific relocation using genetic engineering of the
enzymatic domain of the erythromycin-producing PKS that
contains the chain-releasing thioesterase/cyclase
activity (Cortes, J. et al. Science (1995) 268:1487-1489;
Kao, C.M. et al. J. Am. Chem. Soc. (1995) 117:9105-9106).
In-frame deletion of the DNA encoding part of the
ketoreductase domain in module 5 of the
erythromycin-producing PKS (also known as
6-deoxyerythronolide B synthase, DEBS) has been shown to
lead to the formation of erythromycin analogues
5,6-dideoxy-3-α-mycarosyl-5-oxoerythronolide B,
5,6-dideoxy-5-oxoerythronolide B and
5,6-dideoxy,6β-epoxy-5-oxoerythronolide B (Donadio, S.
et al. Science (1991) 252:675-679). Likewise, alteration
of active site residues in the enoylreductase domain of
module 4 in DEBS, by genetic engineering of the
corresponding PKS-encoding DNA and its introduction into
Saccharopolyspora erythraea, led to the production of
6,7-anhydroerythromycin C (Donadio, S. et al. Proc. Natl.
Acad. Sci. USA (1993) 90:7119-7123).

Patent application WO 00/01827 describes further 
methods of manipulating a PKS to change the oxidation
state of the β-carbon. Substituting the reductive domain
of module 2 of the erythromycin-producing PKS with
domains derived from rapamycin PKS modules 10 and 13 led
to the formation of C10-C11 olefin-erythromycin A and
C10-C11 dihydroerythromycin A respectively.

The second class of PKS, named Type II PKSs, is
represented by the synthases for aromatic compounds.
Type II PKSs contain only a single set of enzymatic
activities for chain extension and these are re-used as
appropriate in successive cycles (Bibb, M.J. et al. EMBO
J. (1989) 8:2727-2736; Sherman, D.H. et al. EMBO J.
(1989) 8:2717-2725; Fernandez-Moreno, M.A. et al. J.
Biol. Chem. (1992) 267:19278-19290). The "extender"
units for the Type II PKSs are usually acetate
(malonyl-CoA) units, and the presence of specific cyclases
dictates the preferred pathway for cyclisation of the
completed chain into an aromatic product (Hutchinson,
C.R. and Fujii, I. Annu. Rev. Microbiol. (1995)
49:201-238). Hybrid polyketides have been obtained by the
introduction of cloned Type II PKS gene-containing DNA
into another strain containing a different Type II PKS
gene cluster, for example by introduction of DNA derived
from the gene cluster for actinorhodin, a blue-pigmented
polyketide from Streptomyces coelicolor, into an
anthraquinone polyketide-producing strain of Streptomyces
galileus (Bartel, P.L. et al. J. Bacteriol. (1990)
172:4816-4826). Occasionally, unusual starter units are
incorporated by Type II PKS, particularly in the
biosynthesis of oxytetracycline, frenolicin and
daunorubicin, and in these cases a separate AT is used to
transfer the starter unit to the PKS.

Fungal PKSs such as the 6-methylsalicylic acid or
lovastatin PKS typically consist of a single multi-domain
polypeptide which includes most of the activities required
for the synthesis of the polyketide portion of these
molecules (Hutchinson, C.R. and Fujii, I. Annu. Rev.
Microbiol. (1995) 49:201-238). Type II fungal PKSs are
also known.

A number of mixed systems comprising polyketide 
synthase and nonribosomal peptide synthase modules have 
been identified including the epothilone and bleomycin 
biosynthetic clusters.

Although large numbers of therapeutically important 
polyketides have been identified, there remains a need to 
obtain novel polyketides that have enhanced properties or 
possess completely novel bioactivity. The complex
polyketides produced by Type I PKSs are particularly
valuable, in that they include compounds with known
utility as anthelmintic, insecticidal, anticancer,
immunosuppressant, antifungal or antibacterial agents.
Because of their structural complexity, such novel
polyketides are not readily obtainable by total chemical
synthesis, or by chemical modifications of known
polyketides. Particular changes that are desired are
changes to the carbon skeleton by altering the nature of
the starter and/or extender unit(s) incorporated, changes
to the oxidation level of the β-keto carbon and therefore
the pattern of oxygen substituents by altering the series
of reductive steps that occur after chain extension, and
changes to the post-PKS "tailoring" steps which generally
comprise hydroxylation, methylation or glycosylation of
the polyketide molecule.

There is also a need to develop reliable and
specific ways of deploying individual modules in practice
so that all, or a large fraction, of hybrid PKS genes
that are constructed are viable and produce the desired
polyketide product. Various strategies have been
described to produce these hybrid PKSs, particularly
utilising recombinant DNA technology and de novo
biosynthesis. There is a particular need to develop
methods of manipulating these PKSs in a manner that
minimises the alteration to the PKS protein structure.
Existing methods of achieving these manipulations
sometimes produce hybrid PKS multienzymes which give the
desired product at only 1% or less of the rate at which
the unmodified PKS produces product.

WO 93/13663 and WO 98/01571 describe novel methods
of engineering PKSs. A well-established method of
altering the nature of the extender unit used at any
position in the polyketide molecule, particularly
malonyl-, methylmalonyl- or ethylmalonyl-CoA, is by domain
substitution. For example, WO 98/01546 and US patent
6,063,561 disclose methods of accomplishing this
modification to form modified erythromycins. Novel
polyketide molecules, in this case particularly novel
erythromycins, are produced by the replacement of an
entire AT domain-encoding DNA fragment on the
Saccharopolyspora erythraea chromosome with an equivalent
heterologous AT domain-encoding fragment from another PKS
cluster. It is well known to those skilled in the art
that selection of the exact DNA/protein splice sites into
which to insert the heterologous domain requires detailed
analysis of the corresponding DNA and protein sequences.

Different researchers choose to use splice sites at
conserved, semi-conserved or non-conserved regions of the
protein, or at sites either within or at the boundaries
of the AT domains. A further drawback of this technique
is that it is hard to predict whether a particular
heterologous domain will work in any given context. A
domain that works successfully in one module may not work
at all in an adjoining module or may produce polyketides
at a vastly reduced yield. Oliynyk, M. et al. (Chem.
Biol. (1996) 3:833-839) and Ruan et al. (J. Bact. (1997)
179:6416-6425) have published studies that exchange a
methylmalonyl-CoA specific AT domain for malonyl-CoA
specific AT domains in modules of the erythromycin PKS.
Products were observed only for changes in modules 1 and
2, with module 2 at a vastly lowered yield. Stassi et
al. (Proc. Natl. Acad. Sci. (1998) 95:7305-9) exchanged
the methylmalonyl-CoA specific AT of module 4 of the
erythromycin PKS for an ethylmalonyl-CoA specific AT and
again product yield was low even after the addition of
the crotonyl-CoA reductase gene thought to increase the
supply of the required ethylmalonyl-CoA precursor. A
possible reason for the limiting yields is the structural
or mechanistic non-compatibility of a heterologous AT
domain with the adjoining KS and ACP domains with which
it must interact properly for efficient polyketide chain
synthesis. Consequently, it is often necessary to try
multiple domain swaps to achieve a novel
polyketide-producing strain that displays adequate
efficiency, a process made particularly arduous when these
changes must be made by gene replacement on the
chromosome through a two step double integration process.
The introduction of splice sites at the DNA level is time
consuming and technically challenging, requiring careful
analysis to ensure the PKS protein coding reading frame
is not disrupted. The introduction of restriction enzyme
sites often requires changes at the amino acid level
which lead to further PKS protein structure disruption
and consequent loss of catalytic efficiency.

A method that could utilise the numerous techniques 
available for site directed mutagenesis to influence the 
AT substrate specificity with minimal disruption to the 
protein tertiary structure would be a valuable addition
to the current techniques.

Changes to an active site have been shown to alter
substrate specificity in other systems. For example, in
an early study, Scrutton et al. (Nature (1990) 343:38-43)
used site directed mutagenesis to switch the coenzyme
substrate specificity of a glutathione reductase.
Identifying and changing a 'fingerprint' structural motif
in the NADP+ binding domain, they could convert the enzyme
into one displaying a marked preference for NAD+. The
techniques of directed evolution have been used to
improve or change enzyme catalytic function. Of many
examples in the literature, Zhang et al. (PNAS (1997)
94:4504-4509) illustrate the conversion of a
galactosidase to a fucosidase by these techniques. The
resulting protein bears 6 mutations, of which 3 lie in,
or in close proximity to, the active site.

Minor but directed changes to a PKS domain can make 
significant changes to its catalytic function. Patent 
application WO 00/00500 teaches that an extender 
ketosynthase domain is converted to a decarboxylating
(and hence loading) ketosynthase domain by site directed
mutagenesis at the active site. US Patent numbers
6,004,787 and 6,066,721 and Jacobsen et al. Science
(1997) 277:367-369 describe the deletion of residues in
the KS1 active site to inactivate this activity to allow
the production of novel polyketides by feeding of
synthetic precursors to the modified PKS.

Several studies have attempted to correlate the
primary amino acid sequence of the AT with its substrate
specificity, in order to determine the amino acids
directly involved in the recognition of the appropriate
substrate, and particularly the nature of the substrate
side chain (i.e. the malonyl portion of the acyl-CoA
thioester). Studies by Haydock et al. (FEBS Lett. (1995)
374:246-248) correlated the substrate specificity of
malonyl- or methylmalonyl-CoA specific AT with a motif 11
amino acids upstream of the known active site.
Comparisons between this motif and the protein structure
of a known acyl transferase from E. coli fatty acid
synthase allowed the authors to assess the proximity of
the motif residues to the active site (and hence its
ability to select the substrate). The authors
acknowledged that "this divergent region thus identified
lies near the acyltransferase active site though not
close enough to make direct contact with the substrate".
Other studies (Katz, L. Chem. Rev. (1997) 97:2557-2575;
Tang, L. et al. Gene (1998) 216:255-265) have correlated
additional residues with a specific extender unit, using
these residues as a tool to predict the AT substrate
specificity from a protein sequence derived from
polyketide gene cluster sequencing projects. It has
remained unclear which residues have mechanistic
importance. In only one case have regions within the PKS
AT domain been exchanged in an attempt to swap AT
specificity: patent application WO 00/01838 and Lau et
al. (Biochemistry (1999) 38:1643-51) implicated a
'hypervariable region' at the C-terminus of the AT domain
in the selection of extender unit. These workers
interchanged this 25-30 amino acid stretch and showed
that this change was sufficient to alter the substrate
specificity of the AT, concluding "a short (23-35 amino
acid) C-terminal segment present in all AT domains is the
principal determinant of their substrate specificity.
Interestingly its length and amino acid sequence vary
considerably among the known AT domains. We therefore
suggest that the choice of extender units by the PKS
modules is influenced by a "hypervariable region", which
could be manipulated via combinatorial mutagenesis to
generate novel AT domains possessing relaxed or altered
substrate specificity". Surprisingly, our structural
molecular modelling studies indicate this region lies at
a surface accessible region away from the active site and
hence is unlikely to directly interact with (and hence
directly select) the malonyl portion of the substrate
used. The effect on substrate specificity is therefore
likely to be imprecise and due to more indirect effects
via, for example, disruption of tertiary structure.



Disclosure of Invention

According to a first aspect of the present invention 
there is provided a method of synthesising a compound 
whereof at least a portion is the product of a polyketide 
synthase (PKS) enzyme complex or is derived from such a
product, said PKS enzyme complex including at least one
acyl transferase (AT) domain. The method includes a step
of providing said PKS enzyme complex in which said AT
domain has been altered to change selectively a minor
proportion of amino acid residues. The altered
residue(s) may comprise one or more motifs which are
present in the active site pocket of the AT domain and
which influence the substrate specificity of the AT
domain, the alteration affecting the substrate
specificity; and/or one or more residues of a motif which
influences the substrate specificity of the AT domain and
which comprises a four-residue sequence corresponding to
the YASH motif of the AT domain of the first module of
DEBS, the alteration affecting the substrate specificity.
Synthesis is then effected by means of said PKS enzyme
complex to produce a compound or mixture of compounds
different from what could have been produced by means of
a PKS enzyme in which said AT domain had not been
altered.

The PKS enzyme complex may be at least part of a 
modular type I PKS enzyme complex, or it may be derived
from a type II PKS system, a fungal PKS system or a 
hybrid system comprising PKS and nonribosomal peptide 
synthase modules. 

The present invention teaches that by altering a few
amino acid residues in the AT domain, and particularly
residues close to the AT active site comprising one or
more residues of a short signature "motif" within the AT
domain, it is possible to influence the acylthioester
selected by that AT domain. Novel polyketides can be
made by a modified PKS in which the signature motif on
one or more modules is altered, e.g. being replaced with
one associated with a different specificity for malonyl
substrate. Furthermore, the invention provides a method
of reducing the proportion of mixed polyketide products
that are occasionally found in natural systems due to
non-specific incorporation of the incorrect extender
units. Conversely, the invention provides a method of
giving a mixed population of polyketide products, thus
increasing the diversity of polyketides produced by a
PKS.

The invention allows the preparation of a modified
PKS by substitution of an existing amino acid residue
motif in the AT that specifies incorporation of one of
the common extender acylthioesters with another motif
found in another AT specifying an alternative
acylthioester. This alters the substrate specificity of
the polyketide synthase when it is expressed in a
polyketide-producing organism.

The DNA sequences have been disclosed for numerous
Type I PKS gene clusters. Comprehensive sequence
analysis of AT domains derived from Type I PKS modules
responsible for the formation of macrolides, particularly
erythromycin, rapamycin, avermectin, rifamycin, FK506,
epothilone, tylosin, and niddamycin; ionophore
polyethers, particularly monensin; and polyenes,
particularly nystatin, allowed us to identify amino acids
that are characteristic of AT domains.

Figure 2 shows the sequence comparison of these AT
domains. This sequence comparison has been generated in
a generally conventional way, employing a computer using
a procedure that creates a multiple sequence alignment
from a group of related sequences. We used a program
called PileUp (Wisconsin Package, Genetics Computer Group
(GCG), Madison, WI, USA), which creates a multiple
sequence alignment using a simplification of the
progressive alignment method of Feng and Doolittle
(Journal of Molecular Evolution 25; 351-360 (1987)). The
method used is similar to the method described by Higgins
and Sharp (CABIOS 5; 151-153 (1989)). The program
executes a series of progressive, pairwise alignments
that allows a large number of sequences to be compared
together to form a final alignment throughout all the
sequences. Gaps can be inserted throughout individual
sequences to allow alignment of regions of strong
similarity. This is often required as strongly conserved
regions are often separated by more variable regions,
both in terms of numbers of amino acids and type of amino
acids. Different programs use different mathematical
algorithms to make these comparisons, resulting in
alignments that differ in minor ways. However, it can be
expected that regions of strong homology would still
align whatever alignment program is utilised. The
particular motifs that are discussed are marked.
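
The pairwise step on which such progressive alignment methods are
built can be sketched as follows. This is a minimal global
(Needleman-Wunsch style) alignment in Python, given only to show how
gaps are inserted to bring conserved residues into register. It is
not the PileUp implementation: the function name global_align and the
scoring values (match +1, mismatch -1, gap -2) are assumptions made
for the illustration, whereas real programs use amino acid
substitution matrices and more elaborate gap penalties.

```python
from typing import Tuple


def global_align(a: str, b: str, match: int = 1,
                 mismatch: int = -1, gap: int = -2) -> Tuple[str, str, int]:
    """Return (aligned_a, aligned_b, score) for a global alignment."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover the alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = global_align("YASH", "YAH")
```

With these toy inputs a single gap is placed opposite the serine, so
that the flanking residues remain aligned.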

These motifs include the conserved GQG motif that is
close to the start of the domain, the GHS motif that
contains the active site serine that covalently binds the
acyl chain prior to transfer to the ACP, and an LPTY motif
that is close to the end of the domain. Other residues
are common to all ATs, including an arginine believed to
stabilise the carboxylate group of the acylthioester.
Further detailed sequence analysis allowed us to identify
amino acid residues that differed between ATs responsible
for the incorporation of malonyl-, methylmalonyl- and
ethylmalonyl-CoA. Some of these amino acids or motifs
had been previously identified during the sequence
analysis of the clusters as previously discussed. While
these motifs could predict whether malonyl- or
methylmalonyl-CoA might be used, they generally fail to
show a difference between methylmalonyl-CoA and
ethylmalonyl-CoA or the other larger extender units
commonly used. We viewed this as an important requirement
for identification of the most important and key residues
involved in substrate recognition and consequently
residues most suitable for alteration. Closer analysis
identified a string of four residues (location identified
clearly in Figure 2) of which two residues are virtually
invariant throughout all ATs, and two residues differ
consistently depending on the extender unit.
Particularly, in the vast majority of ATs responsible for
recognition of malonyl-CoA the sequence of residues HAFH
was identified. In the majority of ATs responsible for
recognition of methylmalonyl-CoA the equivalent segment
was substituted by residues YASH. In ATs responsible for
incorporation of ethylmalonyl-CoA or another similar
sized CoA unit the overall motif was different and less
conserved, but generally displayed the sequence XAGH
(where X is most frequently but not limited to F, T, V or
H). We typically use the terms HAFH, YASH and TAGH to
describe these motifs with respect to malonyl-CoA,
methylmalonyl-CoA and ethylmalonyl/further CoA
specificity but use these terms herein to allow
substitutions in the motif, particularly at residue 1 as
described. Potential substitutions and the exact
location of the motif will be clear to those skilled in
the art by inspection of Figure 2 or similar sequence
analysis.

There are three possible methods to locate the 
position of the motif within an AT sequence that does not 
appear in Figure 2. It is likely that a combination of the
methods will be used.

I)   By simple visual inspection and comparison of
     the sequence to identify the motifs HAFH, YASH
     or TAGH. Since substitutions of residue one
     are often encountered, a useful procedure is to
     look for an alanine (A) separated by one amino
     acid (usually F, S or G) from a histidine (H).

II)  By counting amino acids from the active site
     serine. The start of the motif is typically
     (though not necessarily) between 90 and
     100 amino acids downstream of the GHS active
     site motif.

III) By computer generated multiple alignment that
     allows the new sequence to be directly compared
     to the sequences and motifs we have annotated
     in Figure 2 or to other ATs.

It is preferable to use the third method as this
allows the motif to be identified unequivocally when
there are substitutions within the motif. This is
particularly necessary in some of the more unusual types
of AT in which one of the residues can be substituted by
proline (P). The third method will also identify the
motif when the number of residues between the motif and
the AT active site serine differs significantly from the
norm. The third method will also better identify the
motif when the same or similar string of amino acids
occurs elsewhere in the domain.
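
Methods I and II can be combined in a short script, sketched below in
Python under stated assumptions. The function name
find_motif_candidates is hypothetical, the offset is measured here
from the first residue of GHS to the first residue of the candidate
motif (the text does not fix this convention), and the test sequence
is synthetic. As noted above, a multiple alignment (method III)
remains preferable where the motif carries substitutions, for example
a proline in place of the alanine, which this simple pattern would
miss.

```python
import re
from typing import List, Tuple


def find_motif_candidates(seq: str, min_offset: int = 90,
                          max_offset: int = 100) -> List[Tuple[int, str, int]]:
    """List four-residue candidates of the form x-A-x-H.

    Returns (start_index, motif, offset_from_GHS) for every candidate
    whose start lies min_offset..max_offset residues downstream of the
    first GHS motif. Indices are 0-based.
    """
    ghs = seq.find("GHS")
    if ghs < 0:
        return []
    hits = []
    # Method I: alanine separated by one residue from histidine.
    # A lookahead is used so that overlapping candidates are reported.
    for m in re.finditer(r"(?=(.A.H))", seq):
        offset = m.start() - ghs
        # Method II: keep only candidates in the expected window.
        if min_offset <= offset <= max_offset:
            hits.append((m.start(), m.group(1), offset))
    return hits


# Synthetic sequence for illustration only (not a real AT domain).
toy_at = "MM" + "GHS" + "L" * 92 + "YASH" + "LL"
candidates = find_motif_candidates(toy_at)
```

Widening the window (for example min_offset=80, max_offset=110) is a
reasonable fallback when no candidate is found in the narrower range.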

A particular feature of these motif residues is the
relationship of the size of the third residue compared to
the substrate selected. Hence, when malonyl-CoA is
required the third residue is large (phenylalanine), when
methylmalonyl-CoA is required this residue is
intermediate (serine), and when ethylmalonyl-CoA is
required this residue is small (glycine). The inverse
relationship between substrate side chain size and this
third residue is particularly noteworthy. Interestingly,
this relationship applies also when considering the
incorporation of the more unusual extender units such as
methoxymalonyl-CoA, required for some cycles of chain
extension during production of, for example, FK506 (HAGH).
Currently, only a single example of an AT responsible
for the incorporation of a five carbon-CoA unit has been
disclosed. In this case the AT displays a different
motif at this point, CPTH, in which only the histidine is
conserved. The incorporation of a proline residue in the
motif may be indicative of an AT specifying a larger
substrate. Proline is also found in the motif in ATs that
incorporate the larger unusual starter acids as seen in
the case of avermectin and soraphen. Residues in and
around this area, but lying in the active site of the AT
domain, define the specificity of the domain towards the
substrate chosen.
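
The inverse size relationship described above can be written as a
simple lookup on the third residue of the motif. The Python sketch
below is a deliberately coarse illustration: the name predict_extender
and the returned labels are hypothetical, and only the three cases
stated in the text (F, S, G) are covered. Motifs lacking the alanine
at position 2 or the histidine at position 4, such as CPTH, are
returned as unclassified rather than guessed at.

```python
THIRD_RESIDUE_RULE = {
    "F": "malonyl-CoA",                             # large residue, small substrate
    "S": "methylmalonyl-CoA",                       # intermediate residue
    "G": "ethylmalonyl-CoA or similar-sized unit",  # small residue, larger substrate
}


def predict_extender(motif: str) -> str:
    """Predict the extender unit from a four-residue AT motif."""
    if len(motif) != 4:
        raise ValueError("motif must be exactly four residues")
    motif = motif.upper()
    # Residues 2 (A) and 4 (H) are treated as the near-invariant frame.
    if motif[1] != "A" or motif[3] != "H":
        return "unclassified"
    return THIRD_RESIDUE_RULE.get(motif[2], "unclassified")


predictions = {m: predict_extender(m) for m in ("HAFH", "YASH", "TAGH", "HAGH", "CPTH")}
```

Because the rule looks only at the third residue, HAGH (associated in
the text with methoxymalonyl-CoA) falls into the same broad class as
TAGH.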

Motifs that represent hybrids of motifs for malonyl-
and methylmalonyl-CoA or methylmalonyl- and
ethylmalonyl-CoA were identified. Particularly, in
epothilone module 3 HAFH or YASH (malonyl-CoA or
methylmalonyl-CoA specific) was expected and HASH was
found, and in monensin module 5 TAGH (ethylmalonyl-CoA
specific) was expected and VAGH was found. Significantly,
in both these cases the products of the PKS are a mixture
due to the incorporation of 2 different extender units by
the module containing the hybrid motif, causing formation
of monensins A and B and epothilones A and B. However,
it is known that substrate supply is a significant
determinant of the proportion of monensins A and B formed
(Liu, H. and Reynolds, K.A. (1999) J. Bact. 181:6806-6813).

Many of the previously-proposed "predictive" motifs
are unlikely to be the principal determinant of substrate
specificity because they are not located in the active
site pocket. A particular requirement of any motif that
can serve to distinguish between substrates is that it
lies close to the active site and preferably within the
substrate binding pocket. In this analysis we consider
the substrate binding pocket to be the part of the pocket
that binds/recognises the malonyl portion of the
acylthioester rather than necessarily the coenzyme A
portion. In all probability some of the similarities
previously identified by sequence analysis are due to
evolutionary conservation rather than a mechanistic
requirement. In contrast the residues we have identified
lie in or close to the substrate binding pocket. To
assess the exact location of the motif in space we
compared the protein sequence of ATs derived from Type I
PKS with that of E. coli fatty acid malonyl-CoA:ACP
acyltransferase, for which there is a high resolution
X-ray crystal structure (Serre et al. J. Biol. Chem.
(1995) 270:12961-12964). While the overall level of sequence
similarity between these proteins is low, key residues
(and particularly those with mechanistic importance) are
conserved and the overall spatial arrangement of amino
acids is expected to be conserved. Many groups have used
this structure as a model AT and it is well known in the
art that conservation of structure can be greater than
the level of sequence conservation. Structural analysis
showed that the identified motif would lie within the
active site pocket opposite the active site serine and
the arginine thought to be involved in binding the
substrate carboxylate, and close enough to the
acyltransferase site to interact with the bound substrate
side chain. The invariant histidine found in the motif
is thought to be part of a catalytic triad with the active
site serine, as is typically found in serine hydrolases
(Serre et al., supra). Figure 3 shows the position of the
motif loop and important active site residues in the
model AT structure.

Broadly the invention concerns modifying an AT 
domain by changing the four-residue sequence or motif 
responsible for selecting a substrate so that its 
specificity is altered. We may also change a small
number of other residues close to the active site.
Generally the total number of residues changed is less
than 5% of the residues of the AT.
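
The 5% guideline can be checked with simple arithmetic. The sketch below is a hedged illustration in Python; the function names are assumptions, and the 300-residue sequence used in the usage note is a synthetic placeholder, not a real AT domain.

```python
def fraction_changed(original, modified):
    """Fraction of positions that differ between two equal-length sequences."""
    if len(original) != len(modified):
        raise ValueError("sequences must have the same length")
    if not original:
        raise ValueError("sequences must not be empty")
    changed = sum(a != b for a, b in zip(original, modified))
    return changed / len(original)


def within_limit(original, modified, limit=0.05):
    """True when fewer than `limit` of the residues have been changed."""
    return fraction_changed(original, modified) < limit
```

For a synthetic 300-residue domain in which YASH is replaced by HAFH, two residues differ (Y to H and S to F), so the fraction changed is 2/300, well under 5%.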

The motif is the four-residue sequence corresponding
to the YASH motif found at about residues 334-337 of the
AT domain of the first module of DEBS, numbering as shown
in Fig. 2. It lies in the active site pocket. It
typically starts 80-110, more particularly 90-100, amino
acids downstream of the GHS active site motif.
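
The positional rule above can be sketched as a simple scan. This Python example is an assumption-laden illustration: the function name is hypothetical, offsets are counted from the G of GHS to the first motif residue (the text does not say which residue of GHS is the reference point), and a candidate is taken to be any four-residue window ending in histidine.

```python
def find_specificity_motif(at_sequence, min_offset=80, max_offset=110):
    """Return (position, window) candidates downstream of the GHS motif.

    Offsets are counted from the G of GHS to the first residue of the
    window (an assumption). A candidate is any four-residue window whose
    last residue is H, since the histidine is described as invariant.
    """
    seq = at_sequence.upper()
    ghs = seq.find("GHS")
    if ghs < 0:
        return []
    hits = []
    for offset in range(min_offset, max_offset + 1):
        start = ghs + offset
        window = seq[start:start + 4]
        if len(window) == 4 and window[3] == "H":
            hits.append((start, window))
    return hits
```

On a real AT sequence the scan may return more than one window, in which case alignment against Figure 2 would be needed to pick the correct one.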

In a preferred embodiment of this invention
polyketides of desired structure are produced by the
replacement of an existing AT motif on a PKS with an
alternative one responsible for selection of an
alternative extender or starter unit, or responsible for
an altered degree of selectivity (in most cases,
increased selectivity). This may be carried out in one
or more of the modules encoding a PKS cluster. One type
of embodiment is a PKS including two adjoining domains,
which are "naturally" adjoining or otherwise coupled
domains, wherein the first of them is an AT domain where
the four-residue motif has been altered to change its
specificity, the AT domain acting to transfer a substrate
to the second domain.

In one class of embodiments, this invention provides
a PKS multienzyme or part thereof, or nucleic acid
(generally DNA) encoding it, said multienzyme or part
comprising a loading module and a plurality of extension
modules for the generation of a polyketide, preferably
selected from macrolides, polyethers or polyenes,
wherein the loading or extension modules or at least one
thereof contain a modified AT domain adapted to load and
transfer an optionally substituted malonyl-CoA residue to
(preferably) the ACP. The AT domain is preferably
modified to alter its substrate specificity. This AT
domain may differ from one naturally found in this
position in the module only by the modification of a few
amino acids lying in the active site. This modification
comprises the exchange of all or part of a motif,
particularly but not limited to HAFH with YASH or TAGH or
vice versa. Optionally, alterations to amino acids
outside this sequence, but preferably lying close to the
AT active site, are made.

A second class of embodiments provides a method of
synthesising polyketides having a desired extension unit
at any point around the polyketide molecule by providing
a PKS multienzyme incorporating one or more modified AT
domains and particularly but not limited to an AT domain
possessing the motif HAFH or YASH or TAGH where these
motifs replace the existing natural motif. Optionally,
alterations to amino acids outside this sequence, but
preferably lying close to the AT active site, are made.

A third class of embodiments provides a method of
synthesising polyketides having a desired starter unit by
providing a PKS multienzyme incorporating a modified AT
domain in the loading module and particularly (but not
limited to) an AT domain possessing the motif HAFH or
YASH or TAGH or a motif incorporating a proline residue
where these motifs replace the existing natural motif.
Optionally, alterations to amino acids outside this
sequence, but preferably lying close to the AT active
site, are made. Preferentially, this AT will follow a
KSQ domain but other loading systems are known in the art
(e.g. AT-ACP). Patent application WO 00/00500 describes
some of the known loading systems. The modification of
the loading module can be combined with similar
modifications in other extension units.

A further class of embodiments provides a method of
synthesising polyketides free of natural co-produced
analogues and having a desired extender or loading unit
by replacing an existing hybrid or alternative protein
motif with the sequences HAFH, YASH or TAGH. It is
particularly useful to make this alteration in the
epothilone or monensin PKS gene cluster.

In still further aspects this invention provides a
method of synthesising a mixed population of polyketides
by providing a PKS multienzyme incorporating an AT with
an altered or hybrid motif, particularly, but not limited
to, HASH or VAGH. One particular utility of this method,
though not limited to this utility, is the production of
combinatorial libraries of compounds.

In a further aspect the PKS containing a modified AT
domain may be spliced to a hybrid PKS produced for
example as in WO 98/01546 and WO 98/01571 or WO 00/01827
or WO 00/00500. It is particularly useful to link such a
modified PKS to gene assemblies that produce novel
derivatives of natural polyketides, for example 14-
membered macrolides.

Each of these aspects and classes of embodiment may
involve providing nucleic acid encoding the polyketide
synthase multienzyme and introducing it into an organism
where it can be expressed. Suitable plasmids and host
cells are described below. The polyketide synthase so
produced or portions thereof may be isolated from the
host cells by routine methods, though it is usually
preferable not to do so. The host cells may also be
capable of producing the required acylthioester, e.g. by
producing ethylmalonyl-CoA. It may be
advantageous to remove the PKS from a strain with a
particularly strong supply of an undesired acylthioester
or express the altered PKS in a strain specifically
chosen to have a strong supply of a particular
growth conditions to enhance expression of the desired
product. Conversely, such techniques could be used to
promote formation of mixtures of products if so desired.
It may also be beneficial to supply chemical precursors
to the desired acylthioesters in the media, e.g. supply
diethylethylmalonate or cyclobutane carboxylic acid etc.

The host cells may also be capable of modifying the
initial PKS products, e.g. by carrying out all or some of
the biosynthetic modifications normal in the production
of erythromycin (as shown in figure 4) and for other
polyketides. Use may be made of mutant organisms such
that some or all of the normal pathways are blocked, e.g.
to produce products without one or more "natural" hydroxy
groups or methyl groups or sugar groups.

The invention should not be limited to the exact
motifs described. We have described some of the known
variations within the motif, particularly at residue 1, as
can be determined by inspection of Figure 2 or by
inspection of similar sequence data. However other
modifications can be envisaged; substitution of, for
example, the phenylalanine in the malonyl-CoA motif by
the similar sized tyrosine may still display the same
selectivity. Similarly, the small residue glycine found
in the motif responsible for ethylmalonyl-CoA/other
extender incorporation may be substituted by, for example
but not limited to, alanine. It is well known to those
skilled in the art that these and other similar
conservative substitutions frequently maintain the same
selectivity. Similarly the serine residue found in the
motif for incorporation of methylmalonyl-CoA could be
substituted by a residue intermediate in size and/or
displaying a similar charge distribution.
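
The two concrete substitutions mentioned above (phenylalanine to tyrosine in the malonyl-CoA motif, glycine to alanine in the ethylmalonyl-CoA motif) can be written out as motif variants. This Python sketch is illustrative: the table and function names are assumptions, and whether a given variant retains selectivity is only suggested by the text, not established.

```python
# Conservative substitutions suggested in the text, keyed by
# (parent motif, zero-based position). Illustrative assumption only.
SUGGESTED_SUBSTITUTIONS = {
    ("HAFH", 2): "Y",  # phenylalanine -> similar sized tyrosine
    ("TAGH", 2): "A",  # small glycine -> alanine
}


def conservative_variants(motif):
    """Return motif variants produced by the substitutions listed above."""
    variants = []
    for (parent, position), residue in SUGGESTED_SUBSTITUTIONS.items():
        if parent == motif:
            variants.append(motif[:position] + residue + motif[position + 1:])
    return variants
```

No replacement residue is named in the text for the serine of the methylmalonyl-CoA motif, so none is listed here.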

The invention should not be limited to changes only
in this motif. Alterations to other residues around the
AT domain may also be required to increase the level of
specificity or catalytic efficiency, i.e. to increase the
proportion or amounts of the desired products. These
residues are preferentially close to the substrate
binding pocket. The requirement for these additional
alterations will depend on the particular context or
change desired. Particular residues to alter can be
readily identified by inspection of Figure 2 or other
similar sequence analysis data or alternatively by
analysis of the structural model.

Residues that may be altered in addition to the
motif can be divided into two classes. Some of these
residues may have been previously identified in the
motifs used to predict the specificity of a motif (i.e.
Haydock et al. FEBS Lett. (1995) 374:246-248). These
residues are preferentially close to the substrate-
binding pocket. These residues should not be limited to
the particular examples described.

I) The first class of potential residues to change
includes residues close to the motif on the polypeptide
chain. A particular example is the residue immediately
after the 4 residue motif described in the present
invention. In malonyl-CoA specific ATs this residue is
generally serine (S), i.e. the protein sequence at this
point is generally HAFHS, whereas in methylmalonyl-CoA
specific ATs this residue can be S but can also be T, G,
or C for example. Thus to change a methylmalonyl-CoA
specific AT to a malonyl-CoA specific AT by changing the
signature motif it may be beneficial also to ensure that
the residue immediately after the motif is an S. Since
this residue is close to the motif on the polypeptide
chain it lies close to the substrate binding pocket.

II) The second class includes residues that are
close to the motif or active site in space. These
residues are best identified by reference to the model AT
structure described previously or another AT structure
that may be subsequently derived. It is known to those
skilled in the art that it is possible to thread related
protein sequences into an existing structure by using
structure molecular modelling or related techniques.
Alternatively, an acylthioester may be modelled into the
active site. These are the preferred methods, but often
simple inspection of the existing structure using the
highly conserved motifs as a reference point gives a
reasonable approximation.

A particular example of a residue close in space to
the motif that might be changed is the residue
immediately after the GHS active site motif. In
methylmalonyl-CoA specific ATs this residue is generally
glutamine (Q), i.e. the protein sequence at this point is
GHSQ, whereas in malonyl-CoA specific ATs this residue is
often V, I or L for example. Thus to change a malonyl-
CoA specific AT to a methylmalonyl-CoA specific AT by
changing the signature motif it may be beneficial also to
ensure that the residue immediately after the GHS motif
is a Q. Since this residue is close to the active site
serine it lies within the substrate-binding pocket.

A further example of a residue close in space that
might be altered is the residue lying three residues
downstream of the GQG motif. In methylmalonyl-CoA
specific ATs this residue is generally tryptophan (W),
i.e. the protein sequence at this point is GQGXXW,
whereas in malonyl-CoA specific ATs this residue is often
R, H or T for example. Thus to change a malonyl-CoA
specific AT to a methylmalonyl-CoA specific AT by
changing the signature motif it may be beneficial also to
ensure that this particular residue after the GQG motif
is a W. Analysis of the model AT structure shows that
the GQG motif lies close to the active site pocket and
consequently so does this tryptophan.

A further example of a residue close in space that
might be altered is the residue 4 residues downstream
from the conserved arginine referred to above, which is
believed to stabilise the carboxylate group of the
acylthioester substrate. In malonyl-CoA specific ATs
this residue downstream of the R is generally methionine
(M), i.e. the protein sequence at this point is RXXXMQ.
In methylmalonyl-CoA specific ATs this residue is
generally I or L, and in ethylmalonyl-CoA specific ATs it
is often W. Thus, for example, to change a
methylmalonyl-CoA specific AT to a malonyl-CoA specific
AT by changing the signature motif it may be beneficial
also to ensure that this particular residue is a
methionine. Analysis of the model AT structure shows
that this residue lies close to the active site pocket.
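
The context residues described above for the two classes can be collected into a table and used to flag positions that are atypical for a target specificity. The Python sketch below is illustrative only: the position labels (`after_motif`, `after_GHS`, `GQG_plus_3`, `Arg_plus_4`) and function name are assumptions introduced here, and the residue sets are the "generally" or "often" observed residues quoted in the text rather than strict rules.

```python
# Residues reported as typical at four context positions, per target
# specificity. Position labels are illustrative, not terms from the text.
TYPICAL_RESIDUES = {
    "malonyl-CoA": {
        "after_motif": "S",      # HAFHS
        "after_GHS": "VIL",
        "GQG_plus_3": "RHT",
        "Arg_plus_4": "M",       # RXXXM
    },
    "methylmalonyl-CoA": {
        "after_motif": "STGC",
        "after_GHS": "Q",        # GHSQ
        "GQG_plus_3": "W",       # GQGXXW
        "Arg_plus_4": "IL",
    },
    "ethylmalonyl-CoA": {
        "Arg_plus_4": "W",
    },
}


def suggest_changes(observed, target):
    """List context positions whose residue is atypical for the target.

    `observed` maps a position label to the one-letter residue found there.
    """
    typical = TYPICAL_RESIDUES[target]
    return sorted(
        position
        for position, allowed in typical.items()
        if position in observed and observed[position] not in allowed
    )
```

For example, an AT with T after the motif, Q after GHS, W at GQG+3 and L at Arg+4 is typical for methylmalonyl-CoA, and all four positions would be flagged if the target were malonyl-CoA.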

In further aspects the present invention provides
vectors, such as plasmids or phages (preferably
plasmids), including nucleic acids as defined in the
above aspects, and host cells, particularly
Saccharopolyspora or Streptomyces species, transformed
with such nucleic acids or constructs. It will be
readily apparent to those skilled in the art that there
are multiple molecular biological methods for achieving
the desired alterations to the AT domain, particularly at
the nucleic acid level, e.g. techniques of site directed
mutagenesis or directed evolution. Suitable plasmid
vectors and genetically engineered cells suitable for
expression of PKS genes with modules incorporating an
altered AT domain can readily be designed or selected by
those skilled in the art. They include those described
in WO 98/01546 as being suitable for expression of hybrid
PKS genes of Type I. Examples of effective hosts are
Saccharopolyspora erythraea, Streptomyces coelicolor,
Streptomyces avermitilis, Streptomyces griseofuscus,
Streptomyces cinnamonensis, Streptomyces fradiae,
Streptomyces longisporoflavus, Streptomyces
hygroscopicus, Micromonospora griseorubida, Streptomyces
lasaliensis, Streptomyces venezuelae, Streptomyces
antibioticus, Streptomyces lividans, Streptomyces
rimosus, Streptomyces albus, Amycolatopsis mediterranei,
and Streptomyces tsukubaensis. These include hosts in
which SCP2*-derived plasmids are known to replicate
autonomously, such as for example S. coelicolor, S.
avermitilis and S. griseofuscus; and other hosts such as
Saccharopolyspora erythraea in which SCP2*-derived
plasmids become integrated into the chromosome through
homologous recombination between sequences on the plasmid
insert and on the chromosome; and all such vectors which
are integratively transformed by suicide plasmid vectors.
A plasmid with an int sequence will integrate into a
specific attachment site on the host's chromosome.

It is apparent to those skilled in the art that the
overall sequence similarity between nucleic acids
encoding comparable AT domains from Type I PKSs is
sufficiently high, and the domain organisation of
different Type I PKSs so consistent between different
polyketide-producing organisms, that the processes for
obtaining novel hybrid polyketides described will be
generally applicable to all natural modular Type I PKSs
or their derivatives.

The present invention will now be illustrated, but
is not intended to be limited, by means of some examples.
Amino acids have been defined throughout by their
standard one letter codes as follows: A-alanine,
R-arginine, N-asparagine, D-aspartic acid, C-cysteine,
Q-glutamine, E-glutamic acid, G-glycine, H-histidine,
I-isoleucine, L-leucine, K-lysine, M-methionine,
F-phenylalanine, P-proline, S-serine, T-threonine,
W-tryptophan, Y-tyrosine and V-valine.
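
For readers working with the motifs computationally, the one-letter codes listed above can be held in a dictionary. This is a minimal Python sketch; the names `ONE_LETTER_CODES` and `spell_out` are assumptions introduced for illustration.

```python
# Standard one-letter amino acid codes, as listed in the text.
ONE_LETTER_CODES = {
    "A": "alanine", "R": "arginine", "N": "asparagine",
    "D": "aspartic acid", "C": "cysteine", "Q": "glutamine",
    "E": "glutamic acid", "G": "glycine", "H": "histidine",
    "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline",
    "S": "serine", "T": "threonine", "W": "tryptophan",
    "Y": "tyrosine", "V": "valine",
}


def spell_out(motif):
    """Expand a motif written in one-letter codes into residue names."""
    return [ONE_LETTER_CODES[residue] for residue in motif.upper()]
```

For instance, `spell_out("YASH")` gives tyrosine, alanine, serine, histidine.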

Brief Description of Drawings 
Figure 1 is a diagram showing the functioning of 6-
deoxyerythronolide B synthase (DEBS), a modular PKS
producing 6-deoxyerythronolide B, a precursor of
erythromycin A.

Figure 2 gives the amino acid sequence comparison of
the AT domains of representative Type I PKS gene
clusters. The motifs GQG, GHS and LPTY are marked at the
base of the figure along with the arginine and the motif
defined in the invention as defining specificity. The
abbreviations used at the side to define the PKS used
are: ave: avermectin, debs: erythromycin, epo:
epothilone, sor: soraphen, fkb: FK506, rap: rapamycin,
tyl: tylosin, mon: monensin, nid: niddamycin, nys:
nystatin, rif: rifamycin. The numbers represent the
module number. The letter a at the end of the
designation indicates malonyl-CoA specific AT, the letter
p indicates methylmalonyl-CoA specific AT, and the letter
b indicates ethylmalonyl-CoA specific AT. Further types
of AT with unusual or ill-defined AT specificity are
indicated with letter x. Due to the numbers of sequences
considered, in the pileup each section of 50 amino acids
spreads over two pages. The sequences of the monensin
ATs are unpublished. They are set out in PCT/GB00/02072.
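
The labelling scheme described for Figure 2 can be decoded mechanically. The following Python sketch assumes labels of the form cluster abbreviation, module number, specificity letter (for example `debs1p`); that exact layout and the example labels are assumptions, since the figure itself is not reproduced here, and the function name is hypothetical.

```python
import re

CLUSTERS = {
    "ave": "avermectin", "debs": "erythromycin", "epo": "epothilone",
    "sor": "soraphen", "fkb": "FK506", "rap": "rapamycin",
    "tyl": "tylosin", "mon": "monensin", "nid": "niddamycin",
    "nys": "nystatin", "rif": "rifamycin",
}
SPECIFICITY = {
    "a": "malonyl-CoA",
    "p": "methylmalonyl-CoA",
    "b": "ethylmalonyl-CoA",
    "x": "unusual or ill-defined",
}
# Assumed label layout: <cluster><module number><specificity letter>.
LABEL = re.compile(r"^([a-z]+)(\d+)([apbx])$")


def parse_label(label):
    """Split a Figure 2 style label into (polyketide, module, specificity)."""
    match = LABEL.match(label.strip().lower())
    if match is None or match.group(1) not in CLUSTERS:
        raise ValueError(f"unrecognised label: {label!r}")
    cluster, module, letter = match.groups()
    return CLUSTERS[cluster], int(module), SPECIFICITY[letter]
```

Labels that do not follow the assumed layout (for example loading modules without a number) would need additional handling.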

Figure 3 shows a three-dimensional representation of
the active site of the E. coli acyltransferase. The
spatial arrangement of the motifs described in the text
is shown by arrows and the atoms are shown in bold.

Figure 4 shows the enzymatic steps that convert 6- 
deoxyerythronolide B into erythromycin A in 
Saccharopolyspora erythraea. 

Figure 5 shows the DNA sequence from the monensin
PKS encoding the loading AT used in Example 8.

Modes for Carrying Out the Invention 

Example 1

Construction of plasmid pHP41 

Plasmid pHP41 is a pCJR24-based plasmid containing
the DEBS1 PKS gene comprising a loading module, the first
and second extension modules of DEBS and the chain
terminating thioesterase. The motif YASH of the AT
domain of the first module has been altered to HAFH.
Plasmid pHP41 was constructed via several intermediate
plasmids as follows. Plasmid pD1AT2 (Oliynyk, M. et al.
Chem. Biol. (1996) 3:833-839) was digested with NdeI and
XbaI. A ~11kbp fragment was isolated by gel
electrophoresis and the DNA purified from the gel. This
fragment was ligated into pCJR24 (Rowe, C.J. et al. Gene
(1998) 216:215-223) that had been linearised by digestion
with NdeI and XbaI and treated with alkaline phosphatase.
The ligation mixture was used to transform
electrocompetent E. coli DH10B cells and individual
clones checked for the desired plasmid pCJR26. Plasmid
pCJR26 was identified by restriction pattern. pCJR26 was
transformed into E. coli strain ET12567 (McNeil, D.J. et
al. Gene (1992) 111:61-68) and an individual colony grown
overnight to isolate demethylated DNA. This DNA was
linearised using MscI and AvrII and the ~13kb fragment
(Fragment A) isolated by gel electrophoresis and
purification from the gel.

A DNA segment of the eryAI gene (start nucleotide
45368, end nucleotide 34734) from S. erythraea extending
from nucleotide 42104 to nucleotide 41542 was amplified
by PCR using the following oligonucleotide primers:
5'-TTTTTTTGGCCAGGGTTGGCAGTGGGCGGGCA-3' and
5'-TTTTTACGGCCAGCCGCTTGGCGCGGAT-3'. The DNA from a plasmid
designated pCJR65 derived from pCJR24 and DEBS1TE was
used as a template. The design of the primers introduced
an MscI site at nucleotide 42105 and the second primed
across a BstXI site at position 41546. The 574bp PCR
product was treated with T4 polynucleotide kinase and
ligated to plasmid pUC18 that had been linearised by
digestion with SmaI and then treated with alkaline
phosphatase. The ligation mixture was used to transform
electrocompetent E. coli DH10B and individual clones
checked for the presence of the desired plasmid pHP39.
Plasmid pHP39 was identified by restriction pattern and
sequence analysis. Demethylated DNA was produced by
transforming E. coli strain ET12567 with plasmid DNA.
The resulting DNA was linearised by digestion with MscI
and BstXI and the resulting 552bp fragment (Fragment B)
isolated by gel electrophoresis and purified from the
gel. A DNA segment of the eryAI gene from S. erythraea
extending from nucleotide 41557 to nucleotide 41120 was
amplified by PCR using the following oligonucleotide
primers: 5'-CGGTGCCTAGGTGCACCGACTCCCAGTCC-3' and
5'-TTTTTCCAAGCGGCTGGCCGTGGACCACGCGTTCCACTCCTCGCACGTCGAGACGAT-3'.
DNA from plasmid pCJR65 was used as a template.
The design of the primers introduced an AvrII site at
nucleotide 41125 and the second primed across a BstXI
site at nucleotide 41557 and mutated the amino acid
sequence YASH to HAFH (encoded by nucleotides 41537-
41526). The 442bp PCR product was treated with T4
polynucleotide kinase and ligated to plasmid pUC18 that
had been linearised by digestion with SmaI and then
treated with alkaline phosphatase. The ligation mixture
was used to transform electrocompetent E. coli DH10B and
individual clones checked for the presence of the
desired plasmid pHP40. Plasmid pHP40 was identified by
restriction pattern and sequence analysis. Plasmid pHP40
was linearised by digestion with restriction enzymes
AvrII and BstXI, and a 427bp fragment (Fragment C)
isolated by gel electrophoresis and purified from the
gel. Fragments A, B, and C were ligated together and the
resulting ligation mixture used to transform
electrocompetent E. coli DH10B. Individual clones were
checked for the presence of an insert derived from DEBS1.
The resulting plasmid was designated pHP41. Sequence
analysis was used to confirm the clone contained the
correct motif HAFH.
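
The restriction sites that the primers in this example are said to carry can be checked by a simple pattern scan. The Python sketch below is illustrative: the dictionary keys naming the primers and the function name are assumptions, and the recognition sequences used (MscI TGGCCA, AvrII CCTAGG, BstXI CCANNNNNNTGG) are the standard ones for these enzymes rather than something stated in the text.

```python
import re

# Recognition sequences written 5'->3'; N denotes any base.
SITES = {"MscI": "TGGCCA", "AvrII": "CCTAGG", "BstXI": "CCANNNNNNTGG"}

# Primers quoted in Example 1; the key names are illustrative labels.
PRIMERS = {
    "fragment_B_forward": "TTTTTTTGGCCAGGGTTGGCAGTGGGCGGGCA",
    "fragment_B_reverse": "TTTTTACGGCCAGCCGCTTGGCGCGGAT",
    "fragment_C_avrII": "CGGTGCCTAGGTGCACCGACTCCCAGTCC",
    "fragment_C_mutagenic":
        "TTTTTCCAAGCGGCTGGCCGTGGACCACGCGTTCCACTCCTCGCACGTCGAGACGAT",
}


def site_positions(sequence, site):
    """Zero-based start positions of a recognition site on the given strand."""
    # A lookahead is used so that overlapping occurrences are also reported.
    pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
    return [match.start() for match in re.finditer(pattern, sequence.upper())]
```

All three recognition sequences are palindromic, so scanning one strand is sufficient for this check.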

Example 2 

Construction of S. erythraea NRRL2338 JC2/pHP41 and
production of triketides

S. erythraea NRRL2338 JC2 contains a deletion of
eryAI, eryAII and eryAIII apart from the TE (Rowe, C.J.
et al. Gene (1998) 216:215-223). Plasmid pHP41 was used
to transform S. erythraea NRRL2338 JC2 protoplasts using
the TE as a homology region. Thiostrepton resistant
colonies were selected on R2T20 agar containing 40 µg/ml
thiostrepton. S. erythraea NRRL2338 JC2 (pHP41) was
plated onto SM3 agar (see patent application WO 00/01827)
containing 40 µg/ml thiostrepton and allowed to grow for
11 days at 30°C. Approximately 1cm² of the agar was
homogenised and extracted with a mixture of 1.2ml ethyl
acetate and 20 µl formic acid. The solvent was decanted
and removed by evaporation and the residue dissolved in
methanol and analysed by GC/MS. The major products were
identified by comparison with authentic standards
(Oliynyk, M. et al. Chem. Biol. (1996) 3:833-839) as the
triketide lactones (2S,3R,5R)-2-methyl-3,5-dihydroxy-n-
hexanoic δ-lactone (AAP, i.e. Acetate, Acetate,
Propionate incorporation), (2S,3R,5R)-2-methyl-3,5-
dihydroxy-n-heptanoic δ-lactone (PAP), (2R,3S,4S,5R)-
2,4-dimethyl-3,5-dihydroxy-n-heptanoic δ-lactone (PPP)
and (2R,3S,4S,5R)-2,4-dimethyl-3,5-dihydroxy-n-hexanoic
δ-lactone (APP). These products were identified as their
ammonium adducts corresponding to exact mass 144, 158,
172 and 158. Four products were produced because in this
strain, and under the conditions of the experiment, the
loading module loads both acetate and propionate and the
modified AT loads malonyl-CoA and methylmalonyl-CoA.
Only three triketide lactone peaks could be observed in
the GC/MS spectra under standard conditions; this was due
to the co-elution of the equivalent mass APP and PAP
compounds. An isocratic gradient was used to verify that
this peak comprised two components. In further sets of
experiments S. erythraea JC2 (pHP41) was used to
inoculate 5ml TSB containing 5 µg/ml thiostrepton. After
three days growth 1.5ml of this culture was used to
inoculate 25ml SM3 media containing 5 µg/ml thiostrepton
in a 250ml flask. The flask was incubated at 30°C,
250rpm for 6 days. At this time the supernatant was
adjusted to pH3.0 with formic acid and extracted twice
with an equal volume of ethyl acetate. The solvent was
removed by evaporation and the residue analysed by GC/MS.
In each experiment we could identify the 4 products AAP,
PAP, PPP and APP but the absolute ratios and quantities
were variable, presumably depending on exact media and
growth conditions within each flask (figure 6).
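
The four products and their masses follow from combining two possible starter units with two possible first extender units, the second extender being propionate-derived in every case. The Python sketch below is illustrative: the function name is an assumption, the base mass of 144 for AAP is taken from the text, and the increment of 14 per additional carbon is the nominal mass of a CH2 unit.

```python
from itertools import product

AAP_MASS = 144  # nominal mass of the smallest lactone (AAP), from the text
CH2 = 14        # nominal mass added by each extra methyl or methylene carbon


def triketide_products(starters="AP", first_extenders="AP"):
    """Map each product label to its nominal mass.

    Labels read starter, first extender, second extender; the second
    extender is fixed as propionate-derived (P), as in this example.
    """
    masses = {}
    for starter, extender in product(starters, first_extenders):
        label = starter + extender + "P"
        extra_carbons = (starter == "P") + (extender == "P")
        masses[label] = AAP_MASS + CH2 * extra_carbons
    return masses
```

The result reproduces the masses 144, 158, 158 and 172 quoted above, and shows why APP and PAP, having the same mass, cannot be distinguished by mass alone when they co-elute.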

Example 3

Construction of S. erythraea NRRL2338 (pHP41) and
its use to produce 12-desmethyl erythromycin B.

Plasmid pHP41 was used to transform S. erythraea
NRRL2338 protoplasts. Thiostrepton resistant colonies
were selected on R2T20 agar containing 40 µg/ml
thiostrepton. Several clones were tested for the
presence of pHP41 integrated into the chromosome by
Southern blot hybridisation of their genomic DNA with DIG
labelled vector DNA. A clone with a correctly integrated
copy of pHP41 was identified in this way. S. erythraea
NRRL2338 (pHP41) was used to inoculate 5ml TSB containing
5 µg/ml thiostrepton. After three days growth 1.5ml of
this culture was used to inoculate 25ml EryP media (see
patent application WO 00/00500) containing 5 µg/ml
thiostrepton in a 250ml flask. The flask was incubated
at 30°C, 250rpm for 6 days. At this time the supernatant
was adjusted to pH9.0 with ammonia and extracted twice
with an equal volume of ethyl acetate. The solvent was
removed by evaporation and the residue analysed by
HPLC/MS. A peak of m/z (M+H) = 704 was observed, as
required for C-12 desmethyl erythromycin B, in
addition to a peak corresponding to erythromycin A
(M+H) = 734. Other peaks corresponding to partially
processed erythromycin intermediates could be identified.



Example 4 

Construction of plasmid pHP048 

Plasmid pHP048 is a pCJR24-based plasmid containing the
DEBS1 PKS gene comprising a loading module, the first and
second extension modules of DEBS1 and the chain
terminating thioesterase. The motif YASH of the AT
domain of the first module has been altered to HASH.
Plasmid pHP048 was constructed via several intermediate
plasmids as follows.

A DNA segment of the eryAI gene from S. erythraea
extending from nucleotide 41557 to nucleotide 41120 was
amplified by PCR using the following oligonucleotide
primers: 5'-CGGTGCCTAGGTGCACCGACTCCCAGTCC-3' and
5'-TTTTTCCAAGCGGCTGGCCGTGGACCACGCGTCGCACTCCTCGCACGTCGAGACGAT-3'.
The DNA from plasmid pCJR65 was used as a template.
The design of the primers introduced an AvrII site at
nucleotide 41125; the second primer extended to a BstXI
site at nucleotide 41557 and also mutated the amino acid
sequence YASH (encoded by nucleotides 41537-41526) to
HASH. The 442bp PCR product was treated with T4
polynucleotide kinase and ligated to plasmid pUC18 that
had been linearised by digestion with SmaI and then
treated with alkaline phosphatase. The ligation mixture
was used to transform electrocompetent E. coli DH10B and
individual clones checked for the presence of the desired
plasmid pHP022. Plasmid pHP022 was identified by
restriction pattern and sequence analysis. Plasmid
pHP022 was linearised by digestion with restriction
enzymes AvrII and BstXI, and the fragment (Fragment D)
isolated by gel electrophoresis and purified from the
gel. Fragment D was ligated with Fragments A and B
described previously and the resulting ligation mixture
used to transform electrocompetent E. coli DH10B.
Individual clones were checked for the presence of an
insert derived from DEBS1. The resulting plasmid was
designated pHP048. Sequence analysis was used to confirm
the clone contained the correct motif HASH.

Example 5 

Construction of S. erythraea NRRL2338 JC2 (pHP048) 
and its use to produce triketides 

S. erythraea NRRL2338 JC2 contains a deletion of
eryAI, eryAII and eryAIII apart from the TE (Rowe, C.J.
et al. Gene (1998) 216:215-223). Plasmid pHP048 was used
to transform S. erythraea NRRL2338 JC2 protoplasts using
the TE as a homology region. Thiostrepton resistant
colonies were selected on R2T20 agar containing 40 µg/ml
thiostrepton. S. erythraea JC2 (pHP048) was used to
inoculate 5ml TSB containing 5 µg/ml thiostrepton. After
three days growth 1.5ml of this culture was used to
inoculate 25ml SM3 media containing 5 µg/ml thiostrepton
in a 250ml flask. The flask was incubated at 30°C,
250rpm for 6 days. At this time the supernatant was
adjusted to pH3.0 with formic acid and extracted twice
with an equal volume of ethyl acetate. The solvent was
removed by evaporation and the residue analysed by GC/MS.
A mixture of products was identified as their ammonium
adducts corresponding to the AAP, PAP, APP and PPP
triketide lactones as described in example 2. In this
example, under the media/growth conditions described, the
PKS with the HASH change is more catalytically active
than that with the HAFH change (example 2) as judged by
total amounts of triketide lactone produced; however, in
this case the modified PKS appears to display lower
selectivity towards acetate as judged by the ratio of AAP
to PPP triketide lactone.

Example 6 

Construction of plasmid pHP47 

Plasmid pHP47 is a pCJR24 -based plasmid containing 
20 the DEBS1 PKS gene comprising a loading module, the first 
and second extension modules of DEB SI and the chain 
terminating thioesterase . The motif YASH of the AT 
domain of first module has been altered to VAGH. Plasmid 
pHP47 was constructed by several intermediate plasmids as 
25 follows. 
A DNA segment of the eryAI gene from S. erythraea
extending from nucleotide 41557 to nucleotide 41120 was
amplified by PCR using the following oligonucleotide
primers: 5'-CGGTGCCTAGGTGCACCGACTCCCAGTCC-3' and
5'-TTTTTCCAAGCGGCTGGCCGTGGACGTCGCGGGGCACTCCTCGCACGTCGAGACGAT-3'.
The DNA from plasmid pCJR65 was used as a template.
The design of the primers introduced an AvrII site at
nucleotide 41125; the second primer extended to a BstXI
site at nucleotide 41557 and also mutated the amino acid
sequence YASH (encoded by nucleotides 41537-41526) to
VAGH. The 442 bp PCR product was treated with T4
polynucleotide kinase and ligated to plasmid pUC18 that
had been linearised by digestion with SmaI and then
treated with alkaline phosphatase. The ligation mixture
was used to transform electrocompetent E. coli DH10B and
individual clones were checked for the presence of the
desired plasmid pHP46. Plasmid pHP46 was identified by
restriction pattern and sequence analysis. Plasmid pHP46
was digested with restriction enzymes AvrII and BstXI,
and the fragment (Fragment E) was isolated by gel
electrophoresis and purified from the gel. Fragment E
was ligated with Fragments A and B described previously
and the resulting ligation mixture was used to transform
electrocompetent E. coli DH10B. Individual clones were
checked for the presence of an insert derived from DEBS1.
The resulting plasmid was designated pHP47. Sequence
analysis was used to confirm that the clone contained the
correct motif VAGH.
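
The design of the mutagenic primer in this example can be checked in silico. The following is only an illustrative sketch, not part of the described procedure: it translates the second primer quoted above in all three reading frames using the standard genetic code, reports the frame in which the VAGH motif is encoded, and looks for the BstXI recognition sequence (CCANNNNNNTGG). The helper name `translate` is introduced here for illustration.

```python
import re

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate the complete codons of `dna`, starting at offset `frame`."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Second (mutagenic) primer quoted in Example 6, written 5' to 3'.
PRIMER = "TTTTTCCAAGCGGCTGGCCGTGGACGTCGCGGGGCACTCCTCGCACGTCGAGACGAT"

# BstXI recognises CCANNNNNNTGG.
bstxi = re.search(r"CCA[ACGT]{6}TGG", PRIMER)

# Reading frames (0, 1 or 2) in which the primer encodes VAGH.
frames_with_vagh = [f for f in range(3) if "VAGH" in translate(PRIMER, f)]

print("BstXI site:", bstxi.group(), "at primer offset", bstxi.start())
print("VAGH encoded in frame(s):", frames_with_vagh)
print("Frame 1 translation:", translate(PRIMER, 1))
```

Only one frame contains VAGH, and the codons GTC GCG GGG CAC that encode it lie a few codons downstream of the BstXI site, which is consistent with the primer carrying both the cloning site and the mutation.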

Example 7

Construction of plasmid pLS007

Plasmid pLS007 contains the crotonyl-CoA reductase
(CCR) gene from S. cinnamonensis, which is believed to
influence the level of ethylmalonyl-CoA within the cell.
Plasmid pSG142 (Gaisser et al. Mol. Microbiol. (2000) 36,
391-401) places genes under the control of the actI
promoter and can be used to integrate either in the
right-hand side of the erythromycin gene cluster or in the
act promoter region of a previously transformed
actinomycete.

Two oligonucleotide primers,
5'-GGCAAACATATGAAGGAAATCCTGGACGCG-3' and
5'-TCCGCGGATCCTCAGTGCGTTCAGATCAGTGC-3', were used to
amplify the S. cinnamonensis CCR gene using genomic DNA
as template. The design of the primers incorporated NdeI
and BamHI restriction sites to facilitate cloning. The
1.4 kb PCR product was isolated by gel electrophoresis,
purified from the gel and ligated with pSG142 that had
been digested with NdeI and BglII. The resulting
ligation mixture was used to transform electrocompetent
E. coli DH10B cells. Plasmid pLS003 was identified by
restriction analysis and sequenced to ensure that errors
had not been introduced during amplification. A
discrepancy with the published sequence was identified.
However, further analysis by comparison with other
published CCR protein sequences indicated that pLS003 was
correct. Plasmid pLS003 was digested with NdeI and XbaI
and the resulting 4.5 kb fragment (Fragment F) was
isolated by gel electrophoresis and purified from the
gel. This fragment was ligated to pLSB2, a derivative of
pKC1132 containing the actI/actII promoter region behind
an NdeI site. Plasmid pLSB2 was digested with NdeI and
XbaI and the resulting ~4 kb fragment (Fragment G) was
isolated by gel electrophoresis and purified from the
gel. Fragments F and G were ligated together and the
resulting ligation mixture was used to transform
electrocompetent E. coli DH10B cells. Plasmid pLS007 was
identified by restriction analysis.
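
The cloning of the CCR PCR product into NdeI/BglII-digested pSG142 relies on BamHI and BglII producing the same cohesive end. The sketch below is illustrative only; it assumes that the PCR product was cut with NdeI and BamHI before ligation, which the text implies but does not state. It locates the two sites in the primers quoted above and compares the overhangs left by the three enzymes. The recognition sequences and cut positions are the generally known ones; the helper name `overhang` is introduced here.

```python
# Recognition sequence (5'->3') and cut position on the top strand.
ENZYMES = {
    "NdeI": ("CATATG", 2),   # CA^TATG
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "BglII": ("AGATCT", 1),  # A^GATCT
}


def overhang(name):
    """Return the 5' single-stranded overhang left by a palindromic cutter."""
    site, cut = ENZYMES[name]
    return site[cut:len(site) - cut]


# Primers quoted in Example 7, written 5' to 3'.
FORWARD = "GGCAAACATATGAAGGAAATCCTGGACGCG"
REVERSE = "TCCGCGGATCCTCAGTGCGTTCAGATCAGTGC"

nde_pos = FORWARD.find(ENZYMES["NdeI"][0])
bam_pos = REVERSE.find(ENZYMES["BamHI"][0])

print("NdeI site in forward primer at offset", nde_pos)
print("BamHI site in reverse primer at offset", bam_pos)
for name in ENZYMES:
    print(name, "overhang:", overhang(name))
print("BamHI/BglII ends compatible:", overhang("BamHI") == overhang("BglII"))
```

The NdeI site (CATATG) contains the ATG with which the amplified coding sequence begins in the forward primer, so the gene is placed directly behind the promoter of the vector. A BamHI end joined to a BglII end gives a hybrid junction (GGATCT) that neither enzyme recuts.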



Example 8

Construction of S. erythraea NRRL2338 JC2
(pHP47/pLS007) and its use to produce triketides

S. erythraea NRRL2338 JC2 contains a deletion of
eryAI, eryAII and eryAIII apart from the TE (Rowe, C.J.
et al. Gene 216, 215-223). Plasmid pHP47 was used to
transform S. erythraea NRRL2338 JC2 protoplasts using the
TE as a homology region. Thiostrepton resistant colonies
were selected on R2T20 agar containing 40 µg/ml
thiostrepton. Plasmid pLS007 was used to transform
protoplasts of S. erythraea NRRL2338 JC2 (pHP47);
thiostrepton and apramycin resistant clones were selected
on R2T20 agar containing 40 µg/ml thiostrepton and
50 µg/ml apramycin plus 10 mM magnesium chloride, and the
resistance markers were verified by plating on tap water
medium containing the same antibiotics. S. erythraea
NRRL2338 JC2 (pHP47/pLS007) was used to inoculate 5 ml TSB
containing 5 µg/ml thiostrepton and 50 µg/ml apramycin.
After three days' growth, 1.5 ml of this culture was used
to inoculate 25 ml SM3 medium containing 5 µg/ml
thiostrepton and 50 µg/ml apramycin in a 250 ml flask.
The flask was incubated at 30°C, 250 rpm for 6 days. At
this time the supernatant was adjusted to pH 3.0 with
formic acid and extracted twice with an equal volume of
ethyl acetate. The solvent was removed by evaporation and
the residue analysed by GC/MS. In this experiment the
amounts of triketide product were lower, but a mixture of
products could be identified as their ammonium adducts,
corresponding to exact masses 158, 172 and 186.
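
The relationship between the masses quoted above and the ammonium adducts observed by mass spectrometry can be illustrated with a short calculation. This is a sketch under stated assumptions: the elemental compositions C8H14O3 and C9H16O3 are those of the dimethyl-dihydroxy hexanoic and heptanoic acid lactones, while C10H18O3 for the 186 species is an assumption made here (one further CH2 unit) and is not an assignment given in the text.

```python
# Monoisotopic atomic masses in unified atomic mass units.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ELECTRON = 0.00054858


def mono_mass(composition):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MASS[element] * n for element, n in composition.items())


# Assumed compositions of the three triketide lactones (see lead-in).
LACTONES = {
    "C8H14O3": {"C": 8, "H": 14, "O": 3},
    "C9H16O3": {"C": 9, "H": 16, "O": 3},
    "C10H18O3": {"C": 10, "H": 18, "O": 3},  # hypothetical assignment
}

# Mass added on forming an [M+NH4]+ adduct.
AMMONIUM = mono_mass({"N": 1, "H": 4}) - ELECTRON

adducts = {name: mono_mass(c) + AMMONIUM for name, c in LACTONES.items()}

for name, composition in LACTONES.items():
    print(f"{name}: M = {mono_mass(composition):.4f}, "
          f"[M+NH4]+ = {adducts[name]:.4f}")
```

Under these assumptions the neutral masses round to 158, 172 and 186, and the corresponding ammonium adducts fall 18 mass units higher.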
Example 9

Construction of S. erythraea NRRL2338 (pHP47) and
its use to produce erythromycins

Plasmid pHP47 was used to transform S. erythraea
NRRL2338 protoplasts. Thiostrepton resistant colonies
were selected on R2T20 agar containing 40 µg/ml
thiostrepton. S. erythraea NRRL2338 (pHP47) was used to
inoculate 5 ml TSB containing 5 µg/ml thiostrepton. After
three days' growth, 1.5 ml of this culture was used to
inoculate 25 ml EryP medium containing 5 µg/ml
thiostrepton in a 250 ml flask. The flask was incubated
at 30°C, 250 rpm for 6 days. At this time the supernatant
was adjusted to pH 9.0 with ammonia and extracted twice
with an equal volume of ethyl acetate. The solvent was
removed by evaporation and the residue analysed by
HPLC/MS. Peaks of mass m/z (M+H) = 734, corresponding to
erythromycin A, were observed.

Example 10 

Construction of plasmid pSGK051

Plasmid pSGK051 is a pPFL43-based plasmid (WO
00/00500). The motif HAFH of the AT domain of the
loading domain has been altered to YASH. Plasmid pSGK051
was constructed via several intermediate plasmids as
follows.

Plasmid pPFL43 was digested with restriction enzymes
NcoI and NotI and an 858 bp fragment (Fragment Q) was
isolated by gel electrophoresis and purified from the
gel.

A DNA segment of the monensin loading domain from
nucleotide 16360 to 17366 (see Figure 5 and
PCT/GB00/02072) was amplified by PCR using the following
oligonucleotide primers:
5'-GGGGACGCGGCCGCAAGGCCCACCACCTGAAGGTCAGCTACGCCTCCCACTCCCCGCACATGGACCCCAT-3'
and 5'-GGCTAGCGGGTCCTCGTCCGTGCCGAGGTCA-3'. The first
primer spanned a NotI site at nucleotide 16367 and
changed the amino acid sequence HAFH to YASH at
nucleotides 16398-16409; the second introduced an NheI
site equivalent to that in pPFL43. The DNA from plasmid
pPFL43 was used as a template. The 1006 bp PCR product
was treated with T4 polynucleotide kinase and ligated to
plasmid pUC18 that had been linearised by digestion with
SmaI and treated with alkaline phosphatase. The ligation
mixture was used to transform electrocompetent E. coli
DH10B and individual clones were checked for the presence
of the desired plasmid pCSAT9. Plasmid pCSAT9 was
identified by restriction pattern and sequence analysis.
Plasmid pCSAT9 was digested with restriction enzymes NotI
and NheI and a 995 bp fragment (Fragment R) was isolated
by gel electrophoresis and purified from the gel.
Plasmid pPFL43 was digested with NcoI and NheI to remove
a 1.8 kb fragment, and the larger fragment (Fragment S)
was isolated by gel electrophoresis and purified from the
gel. Fragments Q, R and S were ligated together and the
resulting ligation mixture was used to transform
electrocompetent E. coli DH10B. Individual clones were
checked for the desired plasmid pSGK051. The resulting
plasmid was analysed by restriction digest and sequenced
to confirm the presence of the correct motif YASH.
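
The nucleotide coordinates quoted in this example can be cross-checked against the first primer. The sketch below is illustrative and rests on one assumption: that the first base of the primer corresponds to nucleotide 16360, the start of the amplified segment. With that assumption it maps the NotI site and the YASH codons to sequence coordinates. The helper name `coords` is introduced here.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# First (mutagenic) primer quoted in Example 10, written 5' to 3'.
PRIMER = ("GGGGACGCGGCCGCAAGGCCCACCACCTGAAGGTCAGCTACGCCTCCCAC"
          "TCCCCGCACATGGACCCCAT")
SEGMENT_START = 16360  # assumed coordinate of the first primer base

NOTI_SITE = "GCGGCCGC"       # NotI recognition sequence
MOTIF_DNA = "TACGCCTCCCAC"   # codons expected to encode YASH


def coords(subsequence):
    """Return (first, last) nucleotide coordinates of `subsequence`."""
    offset = PRIMER.find(subsequence)
    if offset < 0:
        raise ValueError("subsequence not found in primer")
    first = SEGMENT_START + offset
    return first, first + len(subsequence) - 1


motif_protein = "".join(
    CODON_TABLE[MOTIF_DNA[i:i + 3]] for i in range(0, len(MOTIF_DNA), 3)
)

print("NotI site spans", coords(NOTI_SITE))
print("Motif codons span", coords(MOTIF_DNA), "and encode", motif_protein)
```

Under the stated assumption the motif codons fall at 16398-16409, matching the coordinates given in the text, and the NotI recognition sequence spans the quoted position 16367.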

Example 11 

Construction of S. erythraea NRRL2338 JC2 (pSGK051)
and production of triketides

Plasmid pSGK051 was used to transform S. erythraea
NRRL2338 JC2 protoplasts using the TE as a homology
region. Thiostrepton resistant colonies were selected on
R2T20 agar containing 40 µg/ml thiostrepton. S.
erythraea NRRL2338 JC2 (pSGK051) was plated onto R2T20
agar containing 40 µg/ml thiostrepton and allowed to grow
for 11 days at 30°C. Approximately 1 cm² of the agar was
homogenised and extracted with a mixture of 1.2 ml ethyl
acetate and 20 µl formic acid. The solvent was decanted
and removed by evaporation, and the residue was dissolved
in methanol and analysed by GC/MS. The major products
were identified by comparison with authentic standards as
the triketide lactones (2S,3R,4S,5R)-2,4-dimethyl-3,5-
dihydroxy-n-heptanoic acid δ-lactone and
(2S,3R,4S,5R)-2,4-dimethyl-3,5-dihydroxy-n-hexanoic acid
δ-lactone.

Example 12

Construction of S. erythraea NRRL2338 (pSGK051) and
its use to produce erythromycins

Plasmid pSGK051 was used to transform S. erythraea
NRRL2338 protoplasts. Thiostrepton resistant colonies
were selected on R2T20 agar containing 40 µg/ml
thiostrepton. S. erythraea NRRL2338 (pSGK051) was plated
onto R2T20 agar containing 40 µg/ml thiostrepton and
allowed to grow for 10 days at 30°C. Approximately 2 cm²
of the agar was homogenised and extracted with a mixture
of 1.2 ml ethyl acetate and 20 µl dilute ammonia. The
solvent was decanted and removed by evaporation and the
residue analysed by HPLC/MS. Peaks of mass m/z (M+H) =
734 and 720 could be observed alongside likely products of
incomplete processing. Comparison with authentic
standards proved that the compounds produced were
erythromycin A and 13-methyl erythromycin A.
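
The two (M+H) values reported in this example differ by 14 mass units, as expected for products differing by one CH2 unit. The sketch below illustrates the arithmetic. The molecular formula of erythromycin A (C37H67NO13) is well established; the formula used for 13-methyl erythromycin A (C36H65NO13) is an assumption made here, on the basis that the 13-ethyl group is replaced by methyl.

```python
import re

# Monoisotopic atomic masses in unified atomic mass units.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647


def mono_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C37H67NO13'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MASS[element] * int(count or 1)
    return total


COMPOUNDS = {
    "erythromycin A": "C37H67NO13",
    "13-methyl erythromycin A": "C36H65NO13",  # assumed formula
}

mh = {name: mono_mass(f) + PROTON for name, f in COMPOUNDS.items()}

for name, mz in mh.items():
    print(f"{name}: (M+H) = {mz:.4f}")
print("difference:", round(mh["erythromycin A"]
                           - mh["13-methyl erythromycin A"], 4))
```

With these formulae the calculated (M+H) values round to 734 and 720, in line with the observed peaks.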
CLAIMS:

1. A method of synthesising a compound whereof at
least a portion is the product of a polyketide synthase
(PKS) enzyme complex or is derived from such a product,
said PKS enzyme complex including at least one
acyltransferase (AT) domain; said method comprising the
steps of (i) providing said PKS enzyme complex in which
said AT domain has been altered to change selectively a
minor proportion of amino acid residues, the altered
residue(s) comprising one or more residues of one or more
motifs which are present in the active site pocket of the
AT domain and which influence the substrate specificity
of the AT domain, the alteration affecting the substrate
specificity; and (ii) effecting synthesis by means of
said PKS enzyme complex to produce a compound or mixture
of compounds different from what could have been produced
by means of a PKS enzyme in which said AT domain had not
been altered.

2. A method according to claim 1 wherein said
motif comprises a four-residue sequence corresponding to
the YASH motif of the AT domain of the first module of
DEBS.

3. A method of synthesising a compound whereof at
least a portion is the product of a polyketide synthase
(PKS) enzyme complex or is derived from such a product,
said PKS enzyme complex including at least one
acyltransferase (AT) domain; said method comprising the
steps of (i) providing said PKS enzyme complex in which
said AT domain has been altered to change selectively a
minor proportion of amino acid residues, the altered
residue(s) comprising one or more residues of a motif
which influences the substrate specificity of the AT
domain and which comprises a four-residue sequence
corresponding to the YASH motif of the AT domain of the
first module of DEBS, the alteration affecting the
substrate specificity; and (ii) effecting synthesis by
means of said PKS enzyme complex to produce a compound or
mixture of compounds different from what could have been
produced by means of a PKS enzyme in which said AT domain
had not been altered.

4. A method according to claim 1, 2 or 3 wherein
said motif was located by a) determining the sequence of
the AT domain and b) performing sequence alignment with a
plurality of sequences of other AT domains.

5. A method according to any preceding claim 
wherein the PKS enzyme complex is at least part of a 
modular type I PKS enzyme complex. 

6. A method according to any preceding claim
wherein said alteration of the AT domain affects less
than 5% of the residues.

7. A method according to any preceding claim
wherein said alteration alters a motif selected from
XAFH, XASH, and XAGH and/or creates such a motif.

8. A method according to claim 7 wherein the motif
is XAGH and X is selected from F, T, V and H.

9. A method according to claim 7 wherein the motif
is XAFH and X is H.

10. A method according to claim 7 wherein the motif
is XASH and X is selected from Y, H, W and V.

11. A method according to any of claims 1-10
wherein said alteration produces or alters a motif
containing proline.

12. A method according to any preceding claim
wherein in addition to the alteration to one or more
residues of said motif(s), one or more additional
residues in or adjacent the substrate binding pocket have
been altered.

13. A method according to claim 12 wherein said
additional altered residue(s) comprise one or more of a)
the residue immediately downstream of the motif, b) the
residue three residues downstream from the GQG motif, c)
the residue immediately downstream of the GHS motif, and
d) the residue four residues downstream of the conserved
arginine residue.

14. A method according to any preceding claim
wherein the alteration produces a motif specific for
malonyl-CoA and the motif is followed by S which was
produced by alteration if not already present.

15. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
methylmalonyl-CoA and the motif is followed by S, G, C or
T which was produced by alteration if not already
present.

16. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
methylmalonyl-CoA, and the residue following the GHS
motif in the active site is Q which was produced by
alteration if not already present.

17. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
malonyl-CoA, and the residue following the GHS motif in
the active site is V, I or L which was produced by
alteration if not already present.

18. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
methylmalonyl-CoA, and the residue 3 residues downstream
of the GQG motif is W which was produced by alteration if
not already present.

19. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
malonyl-CoA, and the residue 3 residues downstream of the
GQG motif is R, H or T which was produced by alteration
if not already present.

20. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
malonyl-CoA and the residue 4 residues downstream of the
conserved R as found as residue 252 in the first module
of DEBS is M which was produced by alteration if not
already present.

21. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
methylmalonyl-CoA and the residue 4 residues downstream
of the conserved R as found as residue 252 in the first
module of DEBS is I or L which was produced by alteration
if not already present.

22. A method according to any of claims 1-13
wherein the alteration produces a motif specific for
ethylmalonyl-CoA and the residue 4 residues downstream of
the conserved R as found as residue 252 in the first
module of DEBS is W which was produced by alteration if
not already present.

23. A method according to any preceding claim
wherein the AT domain has an active site with a GHS
motif, and said motif which is altered starts 80-110
residues downstream of said GHS motif.

24. A method according to any preceding claim
wherein said step (i) of providing said PKS enzyme
complex comprises providing a nucleic acid sequence
encoding said complex and effecting expression thereof.

25. A method according to claim 24 wherein
expression is effected in an organism capable of
producing polyketides.

26. A method according to claim 24 or claim 25
wherein said nucleic acid sequence has been subjected to
site directed mutagenesis so that it encodes said altered
AT domain.

27. A method according to claim 24, 25 or 26
wherein the AT domain prior to alteration is naturally
expressed in a first organism and the altered AT is
expressed in a second organism which is better able than
the first organism to supply a substrate for which the
alteration has increased specificity and/or which is less
well able than the first organism to supply a substrate
for which the alteration has reduced specificity.

28. A method according to any preceding claim
wherein said PKS includes said AT domain and a second
domain which is naturally coupled thereto prior to the
alteration thereof to receive a substrate transferred to
it by the AT; and the alteration causes the AT to act to
transfer a different substrate to the second domain.

29. A method according to any preceding claim
wherein said PKS includes said AT domain and its natural
cognate ACP domain which, prior to the alteration, is
adapted to receive a substrate transferred to it by the
AT; and the alteration causes the AT to act to transfer a
different substrate to said cognate ACP domain.

30. A method according to any preceding claim
wherein said PKS including the altered AT domain is
spliced to a hybrid PKS.

31. A polyketide compound or derivative thereof or
compound whereof a portion is a polyketide or derivative
thereof, which compound is obtainable by a method
according to any preceding claim wherein the compound
differs from a compound resulting from synthesis effected
by means of said PKS enzyme complex without the
alteration of said AT domain.

32. Nucleic acid encoding a PKS enzyme complex
including an altered AT domain as defined in any of
claims 1-30.

33. A vector including a nucleic acid according to
claim 32.

34. A host organism containing nucleic acid
according to claim 32 and able to express the PKS enzyme
complex.

35. A host organism according to claim 34 which is
adapted to synthesise a compound whereof at least a
portion is a polyketide resulting from the action of the
PKS enzyme complex.

36. A method of synthesising a polyketide synthase
(PKS) enzyme complex, said PKS enzyme complex including
at least one acyltransferase (AT) domain; said method
comprising altering said AT domain to change selectively
a minor proportion of amino acid residues, the altered
residue(s) comprising one or more residues of one or more
motifs which are present in the active site pocket of the
AT domain and which influence the substrate specificity
of the AT domain, the alteration affecting the substrate
specificity.

37. A method according to claim 36 wherein said
motif comprises a four-residue sequence corresponding to
the YASH motif of the AT domain of the first module of
DEBS.

38. A method of synthesising a polyketide synthase
(PKS) enzyme complex, said PKS enzyme complex including
at least one acyltransferase (AT) domain; said method
comprising altering said AT domain to change selectively
a minor proportion of amino acid residues, the altered
residue(s) comprising one or more residues of a motif
which influences the substrate specificity of the AT
domain and which comprises a four-residue sequence
corresponding to the YASH motif of the AT domain of the
first module of DEBS, the alteration affecting the
substrate specificity.

39. A PKS enzyme complex as produced by the method
of claim 36, 37 or 38.
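
The motif families of claims 8-10 and the positional window of claim 23 lend themselves to a simple sequence search. The following is a minimal sketch only, not the method by which the motifs were located (claim 4 refers to sequence alignment). The function names, the convention of measuring the offset from the first residue of GHS, and the test sequence are all assumptions introduced here; the test sequence is synthetic and is not a real AT domain.

```python
# Permitted first residue X for each motif family, as listed in claims 8-10.
ALLOWED_X = {
    "AGH": set("FTVH"),  # XAGH, claim 8
    "AFH": set("H"),     # XAFH, claim 9
    "ASH": set("YHWV"),  # XASH, claim 10
}


def classify(motif):
    """Return 'XAGH', 'XAFH' or 'XASH' if `motif` fits claims 8-10, else None."""
    if len(motif) != 4:
        return None
    x, tail = motif[0], motif[1:]
    if x in ALLOWED_X.get(tail, ()):
        return "X" + tail
    return None


def find_motif(at_domain, lo=80, hi=110):
    """Find the first claimed motif starting lo..hi residues after GHS.

    The offset is counted from the first residue of the GHS motif
    (an assumed convention). Returns (offset, motif, family) or None.
    """
    ghs = at_domain.find("GHS")
    if ghs < 0:
        return None
    last_start = min(ghs + hi, len(at_domain) - 4)
    for start in range(ghs + lo, last_start + 1):
        candidate = at_domain[start:start + 4]
        family = classify(candidate)
        if family:
            return start - ghs, candidate, family
    return None


# Synthetic test sequence (hypothetical, for illustration only).
toy_domain = "MK" + "GHS" + "L" * 92 + "YASH" + "SPLM"

print(find_motif(toy_domain))
print(classify("VAGH"), classify("HAFH"), classify("YAFH"))
```

A search of this kind would only flag candidate positions; confirming that a hit corresponds to the YASH motif of the first module of DEBS would still require alignment with other AT domain sequences.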
attyl03a



TSLVISGTPH TVQHITTLCQ QQGIKTKTL. PTNHAFHSPH TNPILNQLH. 
TSLVISGTPH TVQHITTLCQ QQGIKTKTL. PTNHAFHSPH TNPILNQLH. 
TSLVISGTPH TVQHITTLCQ QQGIKTKTL. PTKNAFHSPH TNPILNQLH. 
TSLVISGTPH TVQHITTLCQ QQGIKTKTL. PTNHAFHSPH TNPILNQLH. 
TSLVISGTPH TVQHITTLCQ QQGIKTKTL. PTNHAFHSPH TNPILNQLH. 

SSWLSGDEA AVLQAAEGLG KWTRL. PTSHAFHSAR MEPMLEEFR. 

SSWLSGDEA AVLQAAEGLG KWTRL. ATSHAFHSAR MEPMLEEFR. 

SSWLSGDEA AVLQAAEGLG KWTRL. ATSHAFHSAR MEPMLEEFR. 

SSWLSGDEA AVLQAAEGLG KWTRL. ATSHAFHSAR MEPMLEEFR. 

SSWLSGDET AVLQAAAALG KSTRL. ATSHAFHSAR MEPMLEEFR. 

SSWLSGDEA AVLQAAEGLG KWTRL. ATSHAFHSAR MEPMLEEFR. 

ASIVLSGDED AVLDVAARLG RFTRL. RTSHAFHSAR MEPMLDEFR. 

HSWLSGDEG PVLDVAQQLG IHHRL. PTRHAGHSAR MDPLVAPLL. MeOmalonyl-CoA 

HSWLSGDED AVLDVAQRLG IHHRL. PAPHAGHSAH MEPVAAELL. MeOmalonyl-C©A 

THCVLSGPRT ALEETAQQLH QQGIRHTWL. KVSHAFHSAL MDPMLGAFR. 
THCVLSGPRT ALEETAQHLR EQNVRHTWL. KVSHAFHSAL MDPMLGAFR. 
THCVLSGPRT ALEETAQHLR EQNVRHTWL. KVSHAFHSAL MDPMLGAFR. 
THCVLSGPRT ALEETAQHLR EQNVRHTWL. KVSHAFHSAL MDPMLGAFR. 
SAWLTGAPD DVAAFEREWA AAGRRAKRL. DVGHAFHSRH VDGALDDFR. 
EAVWSGEPE PVADFEAAWT ASGREARKL. KVRHAFHSRH VEAVLDEFR. 
DSTVISGPSD EVDRIAGVWR ERGRKTKAL. SVSHAFHSAL MEPMLAEFT. 
DSTVISGPSG EVDRIAGVWR ERGRKTKAL. SVSHAFHSAL MEPMLAEFT. 
DSTVISGPSG EVDRIAGVWR ERGRKTKAL. SVSHAFHSAL MEPMLGEFT. 
EQWIAGVEQ AVQAIAAGFA ARGARTKRL. HVSHAFHSPL MEPMLEEFG. 
EQWIAGVEQ AVQAIAAGFA ARGARTKRL. HVSHASHSPL MEPMLEEFG. Mal/mraal 
EQWIAGAEK FVQQIAAAFA ARGARTKPL. HVSHAFHSPL MDPMLEAFR. 
DQWIAGAGQ PVHAIAAAMA ARGARTKAL. HVSHAFHSPL MAPMLEAFG. 
DAWIAGAEV QVLALGATFA ARGIRTKRL. AVSHAFHSPL MDPMLEDFQ. 
DSLVLSGDEQ AWSAAGELA ARGRRTKRL. SVSHAFHSPH MDAMLADFR. 
RSWISGAEE AVAEAAAQLA GRGRRTRRL. RVAHAFHSPL MDGMLAGFR. 
LSTWAGDED AWEIARQAE ALGRKTTRL. RVSHAFHSPH MDGMLDDFR. 
LSTWAGDED AVLKIARQVE ALGRKATRL. RVSHAFHSPH MDGMLDDFR. 
SAWLSGAEA TVTALAEQLA ADGRKTRRL. RVSHAFHSPL MEPMLDAFR. 
RSVWAGVEE DVLLLADLFA ADGRRTKRL. RVSHAFHSPL MDAMLDDFA. 
VSVWSGVEA AVGQWDQLV ERGRRVRRL . AVSHAFHSPL MDPMLDAFR. 
VSVWSGVEA AVGQWDQLV ERGRRVRRL. AVSHAFHSPL MDPMLDAFR. 
TSVWAGTEE AVAAIGARFT AQDRKTTRL. RVSHAFHSPL MDPMLAEFR. 
TSLWSGDET ATLAVAARLA EQGRRTTRL. RVSHAFHSPL MDPMLAEFR. 
NALWSGVED AAVEIGARFA AEGRRTTRL. HVSHAFHSPL MDPMLAEFR. 
TSWISGAEE ATQTVAQHFA DQGRRTTAL. RVSHAFHSPL MDPMLAEFR. 
TSWVSGAES AARTVADRLA ENGRKTTRL . RVSHAFHSPL MDPMLAEFR. 
TSWISGAEE ATQTVAQHFA DQGRRTTAL. RVSHAFHSPL M. .MLAEFR. 
TSVWSGYEN ATLAVARHFA DQGRRTTRL. RVSHAFHSPL MAPMLDDFR. 
TAWLSGAGD AVTALGQALA ERGHRTTRL. RVSHAFHSHL MDPMLADFR. 
TAVWAGAED AVRQLTARFA DRGRRTSRL. AVSHAFHSPL MEPMLDAFR. 
QSWISGDEE AAETIAATFA ERGRKTKRL. RVSHAFHSPR MDGMLDAFR. 
RSVWAGAED AALAVRRHFD DLGRRTTRL. PVSHAFHSPL MDPMLDAFR. 
RSVWAGAED AVRAVADRLA ADGRRTRRL. TVSHAFHSPL MDPMLTDFA. 
RSIVLSGDED AVLDLAQQWA ARGRRTRRL. RTSHAFHSPH MDAMLGDFR. 
DSWLSGTEA AVLAVADELA GRGRKTRRL. AVSHAFHSPL MEPMLDDFR. 
TAWVSGEAA AVGEVEKALR GRGLKTKRL. NVSHAFHSPL IEPMLDDFR. 
ASWFSGAED EVGNMADWFA ERGRRVRRL . RTGHAFHSPL MDPMLEEFQ. 
SAWLSGDAD AWAAAARMR ERGHKTKQL . KVSHAFHSAR MAPMLAEFA. 
DSVWSGDRA TVDELTAAWR GRGRKAHHL. RVSHAFHSPH MDPILDELR. 
SSVWSGDEA AVLELLEQWR AEGREARRL. AVSHAFHSPR MDGMLTQFD. 



HAFH/YASH/TAGH motif 
atave00x
atdebs00p
atepo06p
atepo07p
atepo01p
atepo05p
atsoralx
atfkb01p
atfkb09p
atrap03p
atrap06p
atrap04p
atrap13p
atrap01p
atrap07p
atrap10p
atfkb04x
attyl04p
attyl06p
attyl01p
attyl02p
attyl00p
atnid05b
attyl05b
atnid06x
atdebs01p
atmon02p
atmon10p
atmon04p
atmon07p
atmon11p
atmon12p
atmon05b
atmon01p
atdebs02p
atdebs06p
atave01p
atave07p
atave06p
atave09p
atnys01p
atnys11p
atrif05p
atrif07p
atrif08p
atrif10p
atrif03p
atrif06p
atrif04p
atrif01p
atnys02p
atfkb02p
atave11p
atdebs03p
atnid04p
atdebs05p
atdebs04p



351 

SGLLPITPRP 
SALAWFAPGG 
AALGAIRPRA 
AALGAIRPRA 
AALGGLRPGA 
AALGELEPRQ 
DALREVRPNK 
DALEGITSST 
DVLGDITSSA 
DITSDSSSQA 
DITSDSSSQA 
GITAGIGSQP 
DITSDSSSQT 
GVIAGVDSRA 
DITAGIGSQA 
DITSDSSSQD 
RIVAATTSRA 
RVLSGIRPRS 
RVLSGIRPRS 
RVLSGIRPRS 
RVLSGIRPRS 
RVLSGIRPRS 
EVLAPVAPRP 
EVLAPVSPRS 
DLLSGVRPAP 
ELGEDFHPLP 
ERLADIRPTN 
ERLADIRPAN 
EGLADIRPAN 
DRLADIRPAT 
DRLADIQPTT 
ERLADIRPTT 
EGLAGIRPAA 
HTLSGVRPTT 
QALAGITPRR 
ADLDGISARR 
ELLGDISPQP 
ELLGDISPQP 
HLLGDITPQP 
HLLGDITPQP 
EVLAELAPRT 
EVLAELAPRT 
ETLAGIDAQA 
ETLAGITAQA 
EALAGIEAHA 
EALAGIDARA 
KTLAGIDARV 
ETLAGIEAQA 
EMLGGIRAQA 
ETLAGISAQA 
RVLAPVDPRA 
DALADLTPGA 
ELLAPIRART 
ETTGDIAPRP 
AVLAGLRPRA 
TELAGISPVS 
AELGTITAVR 



SRIPFHSSVT G GRL. . DTRELDAAY 

SEVPFYASLT G GAV. . DTRELVADY 

AAVPMRSTVT G GVI. .AGPELGASY 

AAVPMRSTVT G GVI. .AGPELGASY 

AAVPMRSTVT G AMV. .AGPELGANY 

ATVSMRSTVT S TIM. .AGPELVASY 

AQIPIVSEVT G TAL. . DGERFDASH 

PSVPWWSTVD S GWV . . . TEPFGDAY 

PSVPWWSTVD G GWV . . . TEPAGDDY 

PLVPWLSTVD G SWV . . . DSPLDGEY 

PWPWLSTVD G .SWV. . . DSPLDVEY 

PWPWLSTVD G SWV . . . DS PLDGEY 

PLVPWLSTVD G-. TWV. . . DSPLDGEY 

PWPWLSTVD G TWV. . . EGPLDAEY 

PWPWLSTVD G TWV . . . EGPLDVEY 

PLVPWLSTVD G TWV. . . DSPLDGEY 

PEI PWFSTAD E RWI . . . DAPLDDE Y 

PRVPVCSTVA G E..Q PGEPVFDAGY 

PRVPVCSTVA G E..Q PGEPVFDAGY 

PRVPVCS TVA G E . . Q PGE PVFDAGY 

PRVPVCSTVA G E..Q PGEPVFDAGY 

PRVPVCSTVA G E..Q PGEPVFDAGY 

ADI PFYSTVT G.J... GLL . . DGTELDAT Y 

SDIPFYSTVT G APL. . DTERLDAGY 

SRIPFFSTVT A GPC. . GGDQLDGAY 

GFVPFFSTVT G RWT. . QPDELDAGY 

TDVAFYSTVT A ERL. TDTTALDTDY 

TDVAFYSTVT A ERL. TDTTALDTDY 

TDVAFYSTVT A ERL. TDTTALDTDY 

TDVAFYSTVT A ERL. TDTTALDTDY 

TDVAFYSTVT A ERL. DDTTALDTAY 

TDVAFYSTVT A ERL. DDTTTLDTDY 

TDVAFYSTVT A GHL. TDTTELDTAY 

APVAFYSAVT G TRI . . DTAGLDTDY 

AEVPFFSTLT G D . . F LDGTELDAGY 

AAIPLYSTLH G E. -R RD. . .MGPRY 

SGVPFFSTVE G TW LDTTTLDAAY 

SGVPFFSTVE G TW LDTTTLDAAY 

STVPFFSTVE G TW LDTTTLDAAY 

STMPFFSTW G HLVW Y.TTTLDAAY 

SEVPFFSTVT G DWL. . DTARMDAGY 

SEVPFFSTVT G DWL. .DTARMDAGY 

PWPFYSTVA G EWI. TDAGWDGGY 

PDVPFRSTVT G GWV. RDADVLDGGY 

PTLPFFSTLT G DWI . REAGWDGGY 

PLVPFLSTLT G EWI. RDEGWDGGY 

PAIPFYSTVL G TWI . EQA.WDAGY 

PKVPFYSTLI G DWI. RDAGIVDGGY 

PEVPFYSTVT G GWV. EDAGVLDGGY 

PAVPFYSTVT S EWV. RDAGVLDGGY 

PEVPFYSTVT G DRV. DDAA. FDGAY 

PEIPFFSTVD E AWL. DRP A . . DAAY 

GDVPFYSTVT G ERI. . DGTELDADY 

ARVTFHSTVE S RSM. . DGTELDARY 

GRVPFYSTVE A EPL. . DGTALDAGY 

ADVALYSTTT G QPI . . DTATMDTAY 

GS VPLHSTVT G EVI. . DTSAMDASY 

Fig 2o 
WYRNMSSTVR

WRRSFRLPVR 

WADNLRQPVR 

WADNLRQPVR 

WMNNLRQPVR 

WADNVRQPVR 

WVRNFGDPAL 

WYRNLRQPVA 

WYRNLRQPVA 

WYRNLREPVG 

WYRNLREPVG 

WYRNLREPVG 

WYRNLREPVG 

WYRNLREPVG 

WYRNLREPVG 

WYRNLREPVG 

WFRNMRNPVG 

WFRNLRNRVE 

WFRNLRNRVE 

WFRNLRNRVE 

WFRNLRNRVE 

WFRNLRNRVE 

WYRNMREPVE 

WYRNMREPVE 

WYRNTREPVE 

WYRNLRRTVR 

WVTNLRQPVR 

WVTNLRQPVR 

WVTNLRQPVR 

WVTNLRQPVR 

WVTNLRQPVR 

WVTNLRQPVR 

WVRNVRRTVR 

WVTNLRRPVR 

WYRNLRHPVE 

WYDNLRSQVR 

WYRNLHQPVR 

WYRNLHQPVR 

WYRNLHQPVR 

WYRNLHQPVR 

WFRNLRGRVR 

WFRNLRGRVR 

WYRNLRNQVG 

WYRNLRNQVR 

WYRNLRNQVG 

WYRNLRGRVR 

WYRNLRQQVR 

WYRNLRNQVG 

WYRNLRRQVR 

WYRNLRNQVR 

WYTNLRQTVR 

WYDNVRCPVR 

WYRNLRQWR 

WYRNLRETVR 

WYRNLRQRVR 

WYANLREQVR 

WYRNLRRPVL 
atave02a
atave05a
atave04a
atave08a
atave03a
atrap02a
atrap11a
atrap08a
atrap12a
atrap05a
atrap09a
atfkb03a
atfkb07x
atfkb08x
atnid01a
atnid03a
atnid02a
atnid00a
atfkb10a
atrap14a
atmon06a
atmon08a
atmon09a
atepo02a
atepo03x
atepo08a
atepo00a
atepo04a
atnid07a
attyl07a
atsor02a
atsorbla
atnys09a
atnys12a
atnys16a
atnys17a
atnys03a
atnys15a
atnys07a
atnys08a
atnys05a
atnys06a
atnys04a
atnys14a
atnys00a
atnys10a
atnys18a
atnys13a
atave10a
atrif02a
atmon03a
atave12a
atrif09a
atmon00a
attyl03a



QHTQTLTYHP 

QHTQTLTYHP 

QHTQTLTYHP 

QHTQTLTYHP 

QHTQTLTYHP 

AVAEGLTYRT 

AVAEGLTYRT 

AVAEGLTYRT 

AVAEGLTYRT 

TVAERLTYQT 

AVAQGLTYHA 

DVAERLTYHE 

EAASGLTYHQ 

ATTRELRYDR 

DTLNTLNYQP 

DTLNTLNYQP 

DTLNTLNYQP 

DTLNTLNYQP 

GVLESLAFGA 

TALESLKFRA 

EAIRGVKFRQ 

EAIREVKFTR 

EAIRGVKFRQ 

RVAASVTYRR 

RVAASVTYRR 

RVTESVTYRR 

RVAESVSYRR 

RVAATIAYRA 

AVAESVTYRT 

EVAAGLRYRE 

RVAQSLTYHP 

RVAQGLTFHP 

AWEDLTLQP 

AVARGLTYHP 

AVAEGLEYHQ 

AVAEGLEYHQ 

AVAAGLTYHE 

AVAEGLSYGE 

WAEGLSYAA 

AVAEGLSYAT 

AVAEGLSYAT 

AVAEGLSYAT 

AWESLTFTA 

TVAEGLEYHP 

DWSRLTFHQ 

IVAEGLTYRA 

TALAPLTFAE 

RVAEGLTYHE 

RAAEQVTFSA 

AVAERLTYRA 

EVARGLTFHA 

QVAASLTYSE 

AELAGVTWRE 

AVAAGLTFHE 

RVARTLTFAP 
PHTPLITANT

PHTPLITANT 

PHTPLITANT 

PHTPLITANT 

PHTPLITANT 

PQVA MA 

PQVS MA 

PQVS MA 

PQVS MA 

PRLA MA 

PGW MA 

PKLP MA 

PHT A 

PHT - .* A 

PTIPLISNLT GQIADPNHL. 
PTIPLISNLT GQIADPNHL. 
PTIPLISNLT GQIADPNHL. 
PTIPLISNLT GQIADPNHL. 
ARLPWSTTT GRDAAGD.LA 
PALPWSTVT GRLIDQDEMG 

PSIPLMSNVS GERA 

PKVSLISNVS GLEA 

PSIPLMSNVS GERA 

PSVSLVSNLS GKWT . DEL - 
PSVSLVSNLS GKWA. DEL. 
PSIALVSNLS GKPCT. DEV. 
PSIVLVSNLS GKACT.DEV. 
PDRPWSNVT GHVAG.PEI - 
PRLPIVSEVT GRPAAPSEL. 
PELTWSTVT GRPARPGEL. 
ARIPIISNVT GARATDHEL. 
ARIPIISNVT GARATDQEL. 
PLLPWSNLT GKPATVAQL . 
PTIPFVSNVS GGLATAEQV. 
PRIPWSNVT GEVAAAEEL . 
PRIPWSNVT GEVAAAEEL. 
PRIPVLSNLT GTVAAVADL. 
PQIPWSNLT GAVADGTLL. 
PSLPWSNLT GQVATADEL. 
PSLPWSNLT GWLATADEL. 
PTLPWSNLT GRLATADDL. 
PTLPWSNLT GQVATADEL. 
PTTPWSNLT GELAPAEAL. 
PRIPWSNLT GDVADAADL. 
PSIPLVSNLT GELA.GSEI. 
PRIPLVSDLT GRRADDAEV. 
PEIPWSNLT GLPATAEEL. 
PRIPLVSTLL GAPAGA.EL. 
PRIPWSNVT GAPLPAETM . 
GSLPWSTLT GELAA...L. 
PTLPWSNLT GRLADAELM. 
PAIPMVSTLT GDIVAAGEL. 
PEIPWSNVT GRFAEPGEL . 
PVIPWSNVT GELVTATATG 
PTIPLVSTLT GTPVTEETL. 



PPDQLLTPHY 
PPDQLLTPHY 
PPDQLLTPHY 
PPDQLLTPHY 
PPDQLLTPHY 
AGDQVMTAEY 
VGDQVTTAEY 
AGDQLTTTEY 
VGDQVTTAEY 
AGDRVTTAEY 
AGDRVMTAEY 
AGADCATPEY 
IPEDPTTAAY 
IPNDPTTAEY 

CTPDY 

CTPDY 

CTPDY 

CTPDY 

TPEH 

TPEY 

. GEEITDPEY 
.GEEIASPEY 
.GEEITSPEY 

SAPGY 

SAPGY 

SAPGY 

SSPGY 

ATPEY 

MDPGY 

TGPDY 

. ... . .ASPDY 

AS PET 

TSADY 

RTPDY 

CAADY 

CAADY 

CSADY 

GTADY 

CSAEY 

CSAEY 

CSAEY 

CSAEY 

CSADY 

CSADY 

TSAEY 

CTAEY 

AT PHY 

RTPDY 

CTPDY 

DSPDY 

ADAEY 

SDPEY 

TEPGY 

SGAGQADPEY 
CTADH 



WTQQARNTVD 
WTQQARNTVD 
WTQQARNTVD 
WTQQARNTVD 
WTQQARNTVD 
WVRQVRDTVR 
WVRQVRDTVR 
WVRQVRDTVR 
WVRQVRDTVR 
WVRQVRDTVR 
WVRQVRDTVR 
WVRQVRDTVR 
WARQVRDQVR 
WAEQVRNPVL 
WIDHARHTVR 
WIDHARHTVR 
WIDHARHTVR 
WIDHARHTVR 
WLRHARRPVL 
WLRQVRRPVR 
WARHVRNAVL 
WARHVRQTVL 
WARHVRQTVL 
WVRHVREAVR 
WVRHVREAVR 
WVRHAREAVR 
WVRHAREWR 
WVRHVRSAVR 
WTRQIREPVR 
WVAQVREPVR 
WVRHVRHTVR 
WVRHVRDTVR 
WVDHVRHAVR 
WVGHVRAAVR 
WVRHVRATVR 
WVRHVRATVR 
WVRHVREAVR 
WVRHVREAVR 
WVRHVREAVR 
WVRHVREAVR 
WARHVREAVR 
WVRHVREAVR 
WVRHVREAVR 
WVRHVRGTVR 
WVRHVRDTVR 
WVRHVREAVR 
WVCHVRQAVR 
WVRHVRETVR 
WVEHARSTVR 
WVGQVRNAVR 
WVRHVRRPVR 
WVRQVRRTVR 
WAEHVRRPVR 
WARHAREPVR 
WVRQAREPVR 
atave00x FEPAARLLLQ QGP.KTFVEM SPHPVLTMGL QELAPDLG DTTG

atdebsOOp FDEAIRSALE VGP.GTFVEA SPHPVLAAAL QQTL DAEG 

atepo06p FAAAAQALLE GGP.ALFIEM SPHPILVPPL DEIQTA AE 

atepo07p FAAAAQALLE GGP.ALFIEM SPHPILVPPL DEIQTA AE 

atepoOlp FAEWQAQLQ GGH.GLFVEM SPHPILTTSV EEMRRA AQ 

atepo05p FAEAVQSLME DGH . GLFVEM SPHPILTTSV EEIRRA TK 

atsoralx FSTAIDHLLQ EGF.DIFLEL TPHPLALPAI ESNLRR SG 



atfkbOlp MDTAVSELDG SLFIEC SAHPVLLPAL DQ 

atfkb09p MDTAIGELDG SLFIEC SAHPVLLPAL DQ 

atrap03p FHPAVGQLQA QGD.TVFVEV SASPVLLQAM DD 

atrap06p FHPAVGQLQA EGD.TVFVEV SASPVLLQAM DD 

atrap04p FHPAVSQLQA QGD.AVFVEV SASPVLLQAM DD 

atrapl3p FHPAVSQLQA QGD.TVFVEV SASPVLLQAM DD 

atrapOlp FEPAAGQLQA QGD.TVFVEV SASPVLLQAM DD 

atrap07p FDSAVGQLRA EGD.TVFVEV SASPVLLQAM DD 

atraplOp FHPAVSQLQA QGD.TVFVEV SASPVLMQAM DD, 

atfkb04x FAAAVAAARE PGD.TVFIEV SAHPVLLPAI NG 

atty!04p FSAWGGLLE EGH.RRFIEV SAHPVLVHAI EQT A I , .EAAD 



atty!06p FSAWGGLLE EGH.RRFIEV SAHPVLVHAI EQT A EAAD 

attylOlp FSAWGGLLE EGH.RRFIEV SAHPVLVHAI EQT A EAAD 

atty!02p FSAWGGLLE QGH.RRFIEV SAHPVLVHAI EQT A EAAD 

attylOOp FSAWGGLLE EGH.RRFIEV SAHPVLVHAI EQT A EAAD 

atnidOSb FERATRALIA DGH . DVFLET SPHPMLAVAL EQT V TDAG 

attylOSb FEKAVRALIA DGY.DLFLEC NPHPMLAMSL DET L TDSG 

atnid06x FDATVRALLR AGH.HTFIEV GPHPLLNAAI DEI A ADEG 

atdebsOlp FADAVRALAE QGY.RTFLEV SAHPILTAAI EEI G DGSG 

atmon02p FADTIEALLA DGY.RLFIEA SAHPVLGLGM EETIEQ AD 

atmonlOp FADTIEALLA DGY.RLFIEA SAHPVLGLGM EETIEQ AD 

atmon04p FADTIEALLA DGY.RLFIEA SAHPVLGLGM EETIEQ AD 

atmon07p FADTIDALLA DGY.RLFIEA SAHPVLGLGM EETIEQ AD 

atmonllp FADTIEALLA DGY.RLFIEA SPHPVLNLGI QETIEQQA GAA 

atmonl2p FADTIEALLA DGY.RLFIEA SPHPVLNLGM EETIER AD 

atmonOSb FADTIDALLA DGY.RLFIEV SPHPVLNLAL EGLIER AA 

atmonOlp FADAVTALLA DGH . RVFIEA SSHPVLTLGL QETFEE AG 

atdebs02p FHSAVQALTD QGY.ATFIEV SPHPVLASSV QETL DDAE 

atdebs06p FDEAVSAQSP DGH.ATFVEM SPHPVLTAAV QE IA 



ataveOlp FSDAVQALAD DGH.RVFVEV SPHPTLVPAI EDTTEDTA ED.. 

atave07p FSDAVQALAD DGH.RVFVEV SPHPTLVPAI EDTTEDTA ED. . 

atave06p FSHAIQTLTD DGH.RAFIEI SPHPTLVPAI EDTTENTT EN.. 

atave09p FSHAIQTLTD DGH . RPFIEI SPHPTLVPAI EDTTENTT EN.. 

atnysOlp FADAVADLLA AEY.RAFVEV SSHPVLTMAV LD LI EEAG 

atnysllp FADAVADLLA AEY.RAFVEV SSHPVLSMAV QE AI DEAG 

atrif05p FGPAVAELIE QGH.GVFVEV SAHPVLVQPI SE LT D. . . 

atrif07p FGPAVAELLE QGH.GVFVEV SAHPVLVQPI SE LT D. . . 

atrif08p FGPAVAELLG LGH.RVFVEV SAHPVLVQAI SA IA DD. . 

atriflOp FGPAVEALLA QGH.GVFVEL SAHPVLVQPI TE LT DE. . 

atrif03p FGPSVADLAG LGH.TVFVEI SAHPVLVQPL SE....IS DD. . 

atrif06p FGPAVAELVR QGH.GVFVEV SAHPVLVQPL SE LS DD. . 

atrif04p FGPAVAELIE QGH.RVFVEV SAHPVLVQPI NE LV DD. . 

atrifOlp FGAAATALLE QGH . TVFVEV SAHPVTVQPL SE LT GD. . 



atnys02p MEEATRALLA AGH.RVFIEV SPHPVLAAPI QETQEAVA EATG 

atfkb02p FGAAAARLAE LGH . RVFVEA SPHPVLTTAL ADTLAG H 

atavellp FRDATQALVR AGH.TVFIEA CPHPAVAVGV QETLDE.M GD 

atdebs03p FADAVTRLAE SGY.DAFIEV SPHPVWQAV EEAVEE.A DGAE 

atnid04p FESALRAMLA DGV. DAFVEC SPHPVLTVPV RQTLED.A " GA. 

atdebs05p FQDATRQLAE AGF.DAFVEV SPHPVLTVGI EATLDS.A LPAD 

atdebs04p FEQAVRGLVE QGF.DTFVEV SPHPVLLMAV EET A EHAG

atave02a YATTTQTLHQ HG.VTTYIEL GPDNTLTTLT HHNLPNPPTT TLTLTHPHHH
atave05a YATTTQTLHQ HG.VTTYIEL GPDNTLTTLT HHNLPNTPTT TLTLTHPHHH
atave04a YATTTQTLHQ HG.VTTYIEL GPDNTLTTLT HHNLPNTPTT TLTLTHPHHH
atave08a IATTTQTLHQ HG.VTTYIEL GPDNTLTTLT HHNLPNTPTT TLTLTHPHHH
atave03a YATTTQTLHQ HG.VTTYIEL GPDNTLTTLT HDNLPNTPTT TLTLTHPHHH
atrap02a FGEQVASFED A VFVEL GADRSLARLV DG
atrap11a FGEQVASYED A VFVEL GADRSLARLV DG
atrap08a FGEQVASYED A VFVEL GADRSLARLV DG
atrap12a FGEQVASYED A VFVEL GADRSLARLV DG
atrap05a FGEQVASYED A VFIEL GADRSLARLV DG
atrap09a FGEQVASYED A VFVEL GADRSLARLV DG
atfkb03a FAEQVAAYDG A ALLEI GPDRNLARLV DG
atfkb07x FQAHAERYPG A TFLEI GPNQDLSPW DG
atfkb08x FHAHTQRYPD A VFVEI GPGQDLSPLV DG
atnid01a FADAVQTAHD QR.TTTYLEI GAHPTLTTLL HHTLDNP
atnid03a FADAVQTAHH QG.TTTYLEI GPHPTLTTLL HHTLDNP
atnid02a FADAVQTAHD QR.TTTYLEI GPHPTLTTLL HHTLDNP
atnid00a FADAVQTAHH QG.TTTYLEI GPHPTLTTLL HHTLDNP
atfkb10a YADAVRELAD LG.VNMFVAV GPS GALAS AA SENTGGSAGT YH
atrap14a FQDAVRELAE QG.VGTFVEV GPS GALAS AG VECLGGDA.S FH
atmon06a FQPAIAQVAD S..AGVFVEL GPAPVLTTAA QHTLDE.SD. .SQES
atmon08a FQPGIAQVAS T..AGVFVEL GPGPVLTTAA QHTLDDVTDR HGPEP
atmonOSa FQPGVAQVAA E..ARAFVEL GPGPVLTAAA QHTLDHITEP EGPEP
atepo02a FADGVKALHE AG.AGTFVEV GPKPTLLGLL PACLPEAEP
atepo03x FADGVKALHE AG.AGTFVEV GPKPTLLGLL PACLPEAEP
atepo08a FADGVKALHA AG.AGLFVEV GPKPTLLGLV PACLPDARP
atepo00a FADGVKALHA AG.AGTFVEV GPKSTLLGLV PACMPDARP
atepo04a FGDGAKALHA AG.AATFVEV GPKPVLLGLL PACLGEADA
atnid07a FAAAVRAARA AG.AATFVEL GPDAVLSGMA RECAAG DTGT
attyl07a FADAVRTAHR LG.ARTFLET GPDGVLCGMA EECLED DTVA
atsor02a FLDGVRALHA EG.ARVFLEL GPHAVLSALA QDALGQ D.EGTS
atsorbla FLDGVRTLHA EG.ARAFLEL GPHPVLSALA QDALGH D.EGPS
atnys09a FADGIDWLA. RHDTTAFLEL GPDGVLSAMA QDCLDA A.DAD.
atnys12a FADGIDWLAT QGDVHTFLEL GPDGVLSAMA RESLTD P.SRT.
atnys16a FADGVRTLAE RG.ATAFLEI GPDGVLSALA RGVL P.AEA.
atnys17a FADGVRTLAE RG.ATAFLEI GPDGVLSALA AACL.F D.TDA.
atnys03a FADGVTALTD RG.VTTLVEL GPDGVLSAMA QESL .P.DGA.
atnys15a FADGIRALTD AG.VGAFLEL GPDGTLAALA QQSA P.D.A.
atnys07a FADGVTALEA EG.VRTFLEL GPDGVLAAMA GASL T.ESS.
atnys08a FADGITTLEA EG.VRTFLEL GPDGILSALA QQSL A.GEA.
atnys05a FADGVSTLEN EG.VTTFLEL GPDGVLSAMA QQSL T.GDA.
atnys06a FADGVTALEA EG.VRTFLEL GPDGVLAAMA RETV A.DDT.
atnys04a FADGIRTLAD RG.VTTFVEL GPDSVLSAMA QESA P.EGA.
atnys14a FADGVRTMAD RG.VHLFLEL GPDAVLSAMA RQCA P.D.A.
atnys00a FADGITALAK AG.ADVLIEL GPGGVLSAMA RDTL.G P.DST.
atnys10a FADCVRTLRD AG.ATTFLEL GSDGLLTAMA EDTL.G D.DHD.
atnys18a FGDGVRALAD RG.VRTFLEL GPDGVLSALV RENL P.EPG.
atnys13a FADGVRALHD AG.AGTFVEI GPDGVLTALT QQTLDT V.EAGA
atave10a FADGISWLQE QG.VTTCLEI GPDGTLSALA QDSLSA P
atrif02a FSDAVTALGA QG.ASTFLEL GPGGALAAMA LGTLGG P.EQSC
atmon03a FHDGLRALSE QGWR.YLEL GPDPVLATMV QDGLPA P.AEGE
atave12a FGDAISRLHT DG.VRTFMEL GPDGTLSALA EECLEATADS HPADD.DTGT
atrif09a FAEGVAAATE SGG.SLFVEL GPGAALTALV EET
atmon00a FLSGVRGLCE RG.VTTFVEL GPDAPLSAMA RDCFPAPADR SRPRP
attyl03a FLDAMRTLRA DG.IDTFVEL GPDGVLSAMA RDCADDRPDG DTTGAGDGET

451 500
TADTVIMGTL RRGQGTLDHF LTSLAQLRGH GE..TSATTV LSARLTALSP
SSAAW.PTL QRGQGGMRRF LLAAAQAFTG GV..AVDWTA AYDDVGA.EP
QGGAAV.GSL RRGQDERATL LEALGTLWAS G..YPVSWAR LFPAGG
QGGAAV.GSL RRGQDERATL LEALGTLWAS G..YPVSWAR LFPAGG
RAGAAV.GSL RRGQDERPAM LEALGTLWAQ G..YPVPWGR LFPAGG
REGVAV.GSL RRGQDERLSM LEALGALWVH G..QAVGWER LFSAGGAGL.
RRGWL.PSL RRNEDERGVM LDTLGVLYVR G..APVRWDN VYPA...AF.
.E.RTV.ASL RTDDGGWDRF LAALAQAWTQ GA..DVDWTT LIE PA
.E.RTV.ASL RTDDGGWDRF LTALAQAWTQ GA..DVDWTT LIAPa! ! ! ! !
.DWTV.ATL RRDDGDATRM LTALAQAYVH GV..TVDWPA ILG.T
.DWTV.ATL RRDDGDATRM LTALAQAYVH GV..TVDWPA ILG.T
.DWTV.ATL RRDDGDATRM LTALAQAYVH GV..TVDWPA ILG.T
.DWTV.ATL RRDDGDATRM LTALAQAYVH GV..TVDWPA ILG.T. . . ! !
.DWTV.ATL RRDDGDATRM LTALAQAYVE GV..TVDWPA VLG.T
.DWTV.ATL RRDDGDATRM LTALAQAFVE GV..TVDWPA ILG.T. !!!!
.DWTV.ATL RRDDGDATRM LTALAQAYVH GV..TVDWRA VLGDV. . . ! !
...TTV.GTL RR.GGGADRV LDSLAKAHTV GV..AVDWST WAATGAADD
RSVHAT.GTL RRQDDSPHRL LTSTAEAWAH G..ATLTW
RSVHAT.GTL RRQDDSPHRL LTSTAEAWAH G..ATLTW !
RSVHAT.GTL RRQDDSPHRL LTSTAEAWAH G..ATLTW !
RSVHAT.GTL RRQDDSPHRL LTSTAEAWAH G..ATLTW !!!
RSVHAT.GTL RRQDDSPHRL LTSTAEAWAH G..ATLTW !!!
TDAAVL.GTL RRRHGGPRAL ALAVCRAFAH GVE..VDPEA VF !!
GHGTVM.HTL RRQKGSAKDF GMALCLAYVN GLE..IDGEA LF
VAATAL.HTL QRGAGGLDRV RNAVGAAFAH GVR..VDWNA LF
ADLSAI.HSL RRGDGSLADF GEALSRAFAA GVA..VDWES VH
MPATW.PTL RRDHGDTTQL TRAAAHAFTA G..ADVDWRR WF
IPATW.PTL RRDHGDTTQL TRAAAHAFTA G..APVDWRR WF !
IPATW.PTL RRDHGDTTQL TRAAAHAFTA G..ADVDWRR WF
IPATW.PTL RRDHGDTTQL TRAAAHAFTA G..ATVDWRR WF
GTAVTI.PTL RRDHGDTTQL TRAAAHAFTA G..APVDWRR WF
MPATW.PTL RRDHGDAAQL TRAAAQAFGA G..AEVDWTG WF
VPATW.PTL RRDHGDTTQL ARAAAHAFAA G..ADVDWRR WF
VDAVTV.PTL RREDGGRARL ARSLAQAFGA G..CAVRWEN WF
SDAAVL.GTL ERDAGDADRF LTALADAHTR GVA..VDWEA VL
ADAVAI.GSL HRDTAE.EHL IAELARAHVH GVA..VDWRN VF
..VTAI.GSL RRGDNDTRRF LTALAHTHTT GIGTPTTWHH HY
..VTAI.GSL RRGDNDTRRF LTALAHTHTT GIGTPTTWHH HY
..ITAT.GSL RRGDNDTHRF LTALAHTHTT GIGTPTTWHH HY
..ITAT.GSL RRGDNDTHRF LTALAHTHTT GIRTPTTWHH HY !!
VTAVAT.GTL RRDQGGAGRF LLSAAEVFVR GV..DVDWAG AF ! ! !
VPAVAA.GTL RRDQGGTDRF LLSAAEVFVR GV..DVDWAG LF
..AWT.GTL RRDDGGVRRL LTSMAELFVR GV..PVDWAT MA
..AWT.GTL RRDDGGLRRL LTSMAELFVR GV..RVDWAT LV
TDAWT.GSL RREEGGLRRL LTSMAELFVR GV..DVDWAT MV
TAAWT.GSL RRDDGGLRRL LTSMAELFVR GV..EVDWTS LV !
..AWT.GSL RRDDGGLRRL LASAAELYVR GV..AVDWTA AV
..AWT.GSL RREDGGLRRL LTSMAELYVQ GV..PLDWTA VL
TEAWT.GTL RREDGGLRRL LASAAELFVR GV..TVDWSG VL
AI .GTL RREDGGLRRL LASMGELFVR GI..DVDWTA MV
GSAWL.GSL RRDEGGPRRF LTSLAEAHTH GA..PVDWTT TF
PNTAVT.GTL RRGDGGARRF TRSLAELWVR GV..PVSW
LDSLW.GSL RRGEGGLRRF LMSVAELFVG GV..AVEWSG VF
.DAWV.GSL HRDGGDLSAF LRSMATAHVS GV..DIRWDV AL
.GAVAV.GSL RRDDGGLRRF LTSAAEAQVA GV..PVDWAA LC
AGACW.GTL RRDRGGLADF HTALGEAYAQ GV..EVDWSP AF
AEVTCV.PTL RREQSGPHEF LRNLLRAHVH GVGADL

PQTH.

PQTH. 

PQTH 

PQTH 

PQTH 



IAML HGD.HE. . . 
VAML HGD.HE. . . 
VAML HGD.HE. . . 
, VAML HGD.HE. . . 
VAML HTD.HE. . . 
,VAML HGD.HE. . . 
.IPVL HGE.DE. . . 
.IPTQ TGTPEE. . . 
, I ALQ NGTADE . . . 



^ . ^^LLTNL AK TT T . . TWHPHHY 

LLTNL AK TT T. . TWHPHHY 

; LLTNL AK TT T . . TWHPHHY 

LLTNL AK TT T . . TWHPHHY 

LLTNL AK TT T . .TWHPHHY 

. ..AQAAVGAL AHLYVNG.VS V..EW.SAVL 
. . .AQAAVGAL AHLYVNG.VS V..EW.SAVL 
. ..AQAAVSAL AHLYVNG.VT V..DW.PALL 
. . . IQAAIGAL AHLYVNG.VT V..DW.PALL 
. . . AQAAISAL AHLYVNG.VT V. .DW.TALL 
. . .TQAAIGAL AHLYVNG.VT V.. DW.TALL 
. . .ARSAMTAL ARLHTGG.VA V. .DW.PEVI 
. . . VQALHTAL ARLHTRG . GV V . . DW . PTVL 



..VHALHTAL ARLFTRG . AT L..DW.SRIL 

TTIPTL HREHPEPETL TTAL AT ..LHTTGHTT T 

TTIPTL HREHPEPETL TTAL AT ..LHTTGHTT T 

TTIPTL HREHPEPETL TTAL AT ..LHTTGHTT T 

TTIPTL HRERPEPETL TQAI AA VGVRTDGIDW A 



AVL RARTGEES . . 

AVL RPRSPEDV . . 

VLVASL AGERPEES . . 

VLVSSL AGERPEES . . 

WTASL HPDRPDDV. . 

TLLASL RAGREEA. . . 

TLLASL RAGREEA... 

VLLPAS RAGRDEA... 

ALLASS RAGRDEP... 

VLVPSL RADRSEC. . . 

AFAAALRRGR PEC . . . 

LLPAIHKPGT APHGPAA. . . 
PCAFL . . PTL RKGRDDA. . . 
PCAFL. - PTL RKGRDDA. . . 
. AVTL. . PAL RAGRPEE. . . 
.AL.L..PTL RGDRPEE... 
.L.VT. .PTL RKDRDEE. . . 
.E.W..PAL RKGRPEE... 
. A. AV. .PLL RKDRPEE... 
.V.SV..PVL RKDRDEE... 
.L.AV..PLL RKDRPEE... 
.V.TV..PVL RKDRGEE. . . 
.A. TV.. PAL RKDRDEE... 
.V.TV..PVL RRNMPEE... 
.G.TI..PLL RRDRPEE... 
.V.W..PAL RRNRDED. . . 
.TDW..PAL SKGRPEE... 
.AELV. . PML RAGRAEE... 
.LVAV. .PVL RKERPEE. . . 
PAVW . . PLQ RRDRAGD. . . 
.ARAI . . PAL RPDQPEA. . . 

V. .ATL RKNGAEV. . . 

EPEPWAAAL RSKHDEG. . . 
PQENLLIPLL RPDSPEP. . . 
. AEVTCVAAL RDDRPEV. . . 
AAIATC RRGRDEV. . . 



AALTAV AELHAHG.AP V. .DL.AAVL 

CLMTAI AELHAGG.TA I..DW.AKVL 

AFVEAM ARLHTAG.VA V. . DW.SVLF 

AEVEAM ARLHTAG.VA V.. DW.SVLF 

AFAHAM ADLHVAG. IS V..DW.SAYF 

. . . AGVLEAL GRLWAAGGS . V . . SW . PGVF 
. . .AGVLEAL GRLWAAGGS . V. . SW. PGVF 
. . . ASALEAL GGFWWGGS . V . . TW . SGVF 
. . .ATVLEAL GGLWAVGGL. V..SW.AGLF 
. ..EWLAAL GAWYAWGGA. L..DW.KGVF 
. . . ATVLPAA ATAFVQG.AH V. .DW.AAPY 
. . . PGALRAA AAAYGRG.AR V. . DW . AGMH 
...EAFTAAL GALHAAG.LT P..DW.SAFF 
. ..EAFTAAL GALHAAG.LT P..DW.NAFF 
. . . HTLTTAL AGLHVHG.AT L..DW.TGCF 
. . . PALVTAV AAAHAHG.AR V. .DW.SGYF 
. . . SALLAGL ARLHVAG.VT V..DW.SAAL 
. . . HTALTAA AQLHVAG.VD I. . DW . T AVL 
. . . LSAVTGL ARAHVRG.VT V. .RW.AGLF 
...PAAVAAL ARLHT AG . VP V. . DW.TAFY 
. . . PAALAAL AQLHIAG.AR V. .DW.PVLF 
...STALTAR AHLHTRG.LI E..DW.QDFF 
. . . TSALTAL AHLHTAG.LR V. .DW.AAFF 
. . . RTLLTAL GRLHTTG .TP I . . DW . AALL 
...QAVLAAL CHLQVLG.VE A..DW.SATF 
. . . ETLVGAV ARLHVHG.AG P. .RW.DAYF 
...TAFAGAL GRLHTLG . VP V..DW.PAFY 
...LAAATAL ARLQVRG.VD V..DW.AAYL 
...TTVLAAL GTLWAHG . AD V..DW.DAVF 
. . . LALLEGL ATLHTHG . TG ' P . - SW . PAYF 
. . . RSVMTAL AELFVAG.TA V. .EW.AGVF 
. . . PDVLTAL AELHVRG.VG V. .DW.TTVL 
. . . RTLLGAV AALHTDG.QP A. .DL.TALF 
. . . GTLLTGL ARLHTHGAAA V . . NW . PAAL 
„ . . TALITAV AELFVRG.VA V. . DW . PALL 
...ATFLRSL AQAYVRG.AD V..DF.TRAY 



PDPLLTLPLL RRSVPETGDA EH PGG FERAL ATAYAHGV PLRL

atave00x TQQQSLLLDL
atfkb04x AASVTAHDTG
attyl02p ...DPALPPG
attyl00p .DPALPPG
atnid05b .G PGA
attyl05b .G PDS
atnid06x EG TGA
atdebs01p LG TGA
atmon02p ....PADPAP
atmon10p PADPTP
atmon04p ....PADPTP
atmon07p PADPTP
atmon11p ....PADPTP
atmon12p ....PAVPLP
atmon05b PADPAP
atmon01p PATGT.
atdebs02p , GRA
atdebs06p PAA
atave01p THHHTHPHPH
atave07p THHHTHPHNH
atave06p TQTHPHPNPH
atave09p TQTHPHPHNH
atnys01p E GTGA
atrif08p PPA
atrif10p PPA
atrif03p PAA
atrif06p PRT
atrif04p PPS
atrif01p PAA
atnys02p A RSAY
atfkb02p P FGEL
atdebs04p ...RPAVAGG

RRVPLPTYPW

RRVPLPTYPW 

RRVPLPTYPW 

RRVPLPTYPW 

ESMPLPSTAG 

RVLDLPTYPF 

RLLDLPTYPF 

RVLDLPTYAF 

RVLDLPTYAF 

RVLDLPTYAF 

RVLDLPTYAF 

RVLDLPTYAF 

RVPDLPTYAF 

RVLDLPTYAF 

TAHDLPTYAF 

HLTTLPTYPF 

HLTTLPTYPF 

HLTTLPTYPF 

HLTTLPTYPF 

HLTTLPTYPF 

RPVELPTYPF 

RRVNPPTYPF 

RRVPLPSYAF 

RRVPLPTYPF 

RTIDLPTYAF 

RTVDLPTYAF 

RTVDLPTYAF 

RTIDLPTYAF 

RTVDLPTYAF 

RWDLPTYAF 

RTVDLPTYAF 

STVELPTYAF 

GLVDLPGYPF 

PPVALPNYPF 

THLDLPTYPF 

.HLDLPTYPF

THLDLPTYPF

.HLDLPTYPF

ARVDLPTYAF

SRIDLPTYAF

.RVELPTYAF

.RVDLPTYAF

.RVDLPTYAF

.RADLPTYAF

GWVDLPTYAF 

GRVDLPKYAF 

RRVELPTYAF 

GWVDLPTYAF 

QPVDLPTYPF 

RGVPLPTYPF 

CGVELPTYAF 

APFALPTYPF 

GWVDLPTYAF 

RPVELPVYPF 

RPAELPTYPF 



DDGN
QHERCWIEVE
QHERYWIEDS 
QRERYWIEAP 
QRERYWVDAP 



RER ~ ~ 

PDARR~ — * 

VHGSKPSLRL RQLRNGATDH 
AKSAAGDRRG VRAGGHPLLG 
TGGAAGGSRF AHAGSHPLL- 



DHKRYWLQPA 

DHKRYWIEAT 

QHQRYWVE..

QHQRYWLR..

QHQRYWVK..

QHQRYWLK..

QHQRYWLK..

QHQRFWAE..

QHQRYWAEAG 

HHERYWIEPA 

NHHHYWLDTT 

NHHHYWLDTT 

NHHHYWLDTT 

NHHHYWAVTS 

NHHHYWLDTI 

QRERYWCHP. 

QRERYWYHPT 

HRDRFWLPTA 

QRERVWLEPK 

QRRRYWLADT 

QHQHYWLERS 

QHQHYWLEEP 

QRRSYWL . . P 

QHKHYWVEPP 

QRERFWLEGR 

QRQDFWPAPA 

QRRRYWLEAP 

QGKRFWLLPD 

EPQRYWLAPE 

QHQHYWLESS 

QRQHYWLD.A

QHQHYWLQPP 

QHQHYWLQ — 

QRERYW.NTR 

QHEHLW.AVP 

DHQHFW..LS

DHQHFW..LR

DHQHYW..LR

DHEHYW..LR

DRRHFW..LH

DHRHYW..LR

DHQHYW..LQ

EHRHYW..LE

QRQDFWPEAR 

RRDRYWVDAE 

ERERFWLDVE 

QRKRYWLQPA 

QRERYWVAPA 

QRQRYWLPIP 

EHQRFWPRPH 



PVT 

GAADLTALGL